1 Introduction

The occupancy scheme is a simple urn model which often occurs in probability, statistics, 
combinatorics and computer science. We can cite for example species sampling [9], [19],
analysis of algorithms [IB], learning theory [§], etc. The books by Johnson and Kotz [22] 
and by Kolchin et al. [23] are standard references. The present work is partly motivated by 
the study of fragmentation trees. 

Let us recall the occupancy scheme. Let $X$ be a countable set and $p = (p_i : i \in X)$
be a probability measure on $X$. The occupancy scheme on $(X, p)$ is described as follows.
For all $i \in X$ such that $p_i \neq 0$, one places a box at $i$. One then throws $n$ balls
successively and independently into the boxes, each ball having probability $p_i$ of falling
into the box located at $i$. We may be interested in the number of boxes containing exactly
$j$ balls, or in the number of occupied boxes, etc.
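The classical scheme just described can be simulated in a few lines. The following is only a minimal sketch: the function names `occupancy` and `boxes_with_exactly`, the finite list of box probabilities and the fixed seed are illustrative assumptions, not part of the original text.

```python
import random
from collections import Counter

def occupancy(n, probs, seed=0):
    """Throw n balls independently into boxes 0..len(probs)-1.

    Each ball falls into box i with probability probs[i].
    Returns a Counter mapping each occupied box to its number of balls.
    """
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility only
    throws = rng.choices(range(len(probs)), weights=probs, k=n)
    return Counter(throws)

def boxes_with_exactly(counts, j):
    """Number of boxes containing exactly j balls (j >= 1)."""
    return sum(1 for c in counts.values() if c == j)

counts = occupancy(1000, [0.5, 0.25, 0.125, 0.125])
occupied = len(counts)  # number of occupied boxes
```

Only occupied boxes appear in the returned counter, so `len(counts)` is the number of occupied boxes.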

We consider here a variant of the occupancy scheme which corresponds to a nested family
of boxes. We introduce the infinite genealogical tree $\mathcal{U}$:

$$\mathcal{U} := \bigcup_{k=0}^{\infty} \mathbb{N}^k,$$

with $\mathbb{N} = \{1, 2, \ldots\}$ and the convention $\mathbb{N}^0 = \{\emptyset\}$. The elements of $\mathcal{U}$ are called individuals,
and for every integer $k \in \mathbb{Z}_+$, the $k$-th generation of $\mathcal{U}$ is formed by the individuals $i \in \mathbb{N}^k$
(we write $|i| = k$). For every individual $i = (i_1, \ldots, i_k)$ of $\mathcal{U}$ and $j \in \mathbb{N}$, the individual
$ij = (i_1, \ldots, i_k, j)$ is called the $j$-th child of $i$ and $i$ is the parent of $ij$. We suppose a real
number $p_i \in [0, 1]$ is assigned to each individual $i$, such that $p_\emptyset = 1$ and $\sum_{j \in \mathbb{N}} p_{ij} = p_i$ for
all $i \in \mathcal{U}$. Note that for all $k$, $p(k) := (p_i : |i| = k)$ is a probability measure on $\mathbb{N}^k$. We can
couple the occupancy schemes on $(\mathbb{N}^k, p(k))$ as follows. Each individual $i$ should be viewed
as a box, such that the box $ij$ is contained in the box $i$ for every $i \in \mathcal{U}$ and $j \in \mathbb{N}$ such
that $p_{ij} \neq 0$. Initially the $n$ balls are thrown in the box located at $\emptyset$. Then one places
successively and independently the $n$ balls in the boxes of the first generation by assuming
that each ball has probability $p_i$ of falling into the box situated at $i \in \mathbb{N}$. Likewise, by
iteration, a ball located in the box $i$ is placed independently of the others into the sub-box
$ij$ with probability $p_{ij}/p_i$.

We denote by $H_{n,j}$ the first generation at which all the boxes contain fewer than $j \ge 2$ balls
when $n$ balls have been thrown ($H_{n,j}$ is called a height) and by $G_{n,j}$ the first level at which there
exists a box containing fewer than $j \ge 1$ balls ($G_{n,j}$ is called a saturation level). Our aim is
to study the asymptotic behaviours of $H_{n,j}$ and $G_{n,j}$ as $n$ tends to infinity when we consider
a certain randomized version of $(p_i, i \in \mathcal{U})$.

More precisely, in the present work, we shall assume that we are given a random probability
measure $p = (p_1, p_2, \ldots)$ on $\mathbb{N}$. We assign to each individual $i$ an independent copy
$p(i)$ of $p$. The real numbers $p_i$, $i \in \mathcal{U}$, are defined by induction: $p_\emptyset := 1$ and $p_{ij} := p_i \, p_j(i)$
for all $i \in \mathcal{U}$ and $j \in \mathbb{N}$. This means that $p(i)$ describes how the mass of the individual $i$ is
split among its children. This model is called the occupancy scheme of multiplicative cascades.
We emphasize that there are two levels of randomness in our model, namely
the arrangement of the boxes and the way one throws the balls.

In the particular case when $p$ is supported by a finite number of integers, in the sense that
$\#\{j : p_j > 0\} \le b$ a.s. for some integer $b \ge 2$, the height $H_{n,j}$ has a natural interpretation
in terms of a special class of random split trees which have been considered e.g. by Devroye
[14]. Specifically, imagine that each box has a rupture threshold of $j$, in the sense that when
a ball falls into some box $i$ already containing $j - 1$ balls, then this box is removed and the
$j$ balls are shared out amongst the children of $i$ according to the random probability $p(i)$
(i.e. conditionally on $p(i)$, each ball is put in the box $i\ell$ with probability $p_\ell(i)$, independently
of the other balls). This procedure yields a random tree where all balls are stored at leaves,
and $H_{n,j}$ is the height of this tree when $n$ balls have been thrown.
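To make the definition of the height concrete, here is a rough simulation sketch for the special splitting law $p = (U, 1-U, 0, \ldots)$ with $U$ uniform, a choice made only for illustration. The function name `height` and the naive binomial sampling are assumptions of this sketch; it only tracks boxes that still hold at least $j$ balls, since the other boxes can never again violate the threshold.

```python
import random

def height(n, j, rng):
    """Sample H_{n,j} for the splitting law p = (U, 1-U, 0, ...).

    H_{n,j} is the first generation at which every box holds fewer
    than j balls. Requires j >= 2 (a single ball is never split).
    """
    if j < 2:
        raise ValueError("the height is defined for j >= 2")
    # Ball counts of the boxes that still hold at least j balls.
    level = [n] if n >= j else []
    k = 0
    while level:
        nxt = []
        for c in level:
            u = rng.random()  # independent copy of U attached to this box
            left = sum(rng.random() < u for _ in range(c))  # Binomial(c, u)
            nxt.extend(x for x in (left, c - left) if x >= j)
        level = nxt
        k += 1
    return k

h = height(500, 2, random.Random(1))
```

With threshold $j = 2$ and 500 balls, at least 500 non-empty boxes are needed before every box holds at most one ball, so any sample satisfies $2^h \ge 500$.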

We further point out that the height $H_{n,j}$ also arises as a natural shattering time in
homogeneous fragmentation chains, a class of partition-valued Markov chains. More precisely,
the shattering time is defined as the first instant when all the blocks of the partition process
have cardinality less than $j$. See [1] for background and Section 3.2 in [5] for a description
which is closely related to the present work. In a different direction, we mention the work of
Haas et al. [IB] who associate another random tree to homogeneous fragmentation processes.

We shall show the following result: there exist an integer $j^* \in \{2, 3, \ldots\} \cup \{\infty\}$ and a
sequence of positive real numbers $C_2 > \cdots > C_{j^*} = C_{j^*+1} = \cdots =: C^*$ such that for every
integer $j \ge 2$,

$$H_{n,j} \underset{n \to \infty}{\sim} C_j \ln n \quad \text{a.s.}$$

It should not be surprising that the heights $H_{n,j}$ have logarithmic asymptotics, as a large
class of trees, such as random split trees, behave in this way. On the other hand, it is
remarkable that there exists a critical parameter $j^* \in \{2, 3, \ldots\} \cup \{\infty\}$ at which a phase
transition occurs (provided that $2 < j^* < \infty$). This is unusual for a random split
tree. Indeed, Devroye proved in [13] that for a large class of random split trees whose
internal nodes contain at least one ball¹, all the heights, regardless of the value of
the rupture threshold $j$, have the same asymptotic behaviour (in probability). In particular,
no phase transition occurs.

It may also be interesting to consider the situation where the parameter $j$ depends on the
number of balls $n$. In the case of power functions, we shall prove that under some technical
conditions that will be detailed below,

$$H_{n,n^\alpha} \underset{n \to \infty}{\sim} (1 - \alpha) C^* \ln n \quad \text{a.s.}$$

for all $\alpha \in (0, 1)$, so that there is no phase transition in the asymptotics of $H_{n,n^\alpha}$.

Concerning the asymptotics of the saturation level $G_{n,j}$, we shall prove the following
result: under some technical conditions that will be detailed below, there exists a constant
$C_* < C^*$ such that for every positive integer $j$,

$$G_{n,j} \underset{n \to \infty}{\sim} C_* \ln n \quad \text{a.s.}$$

In particular, there is no phase transition in the asymptotics of $G_{n,j}$. We shall also show
that under some further technical conditions, for all $\alpha \in (0, 1)$,

$$G_{n,n^\alpha} \underset{n \to \infty}{\sim} (1 - \alpha) C_* \ln n \quad \text{a.s.},$$

so that there is also no phase transition in the asymptotic behaviours of $G_{n,n^\alpha}$.

Our approach essentially relies on the theory of branching random walks, and more
precisely on their large deviation behaviour, whose description is due to Biggins [7]. The
construction of the boxes given by the multiplicative cascades indeed enables us to define
a branching random walk giving the sizes of the boxes at each generation. We shall see
that the critical parameter $j^*$ and the real numbers $C_2, \ldots, C_{j^*}$ and $C_*$ are described by
that branching random walk. Another key technique is Poissonization: instead of throwing
exactly $n$ balls, one throws $\mathcal{P}_n$ balls, where $\mathcal{P}_n$ is a Poisson variable with parameter $n$ which
is independent of $(p(i), i \in \mathcal{U})$. For every integer $k$, conditionally on the sizes of the boxes of
the $k$-th generation, the numbers of balls per box of the $k$-th generation are thus independent
Poisson variables.
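The independence produced by Poissonization can be checked exactly in the simplest case of two boxes. The sketch below is illustrative only (the helper names `poisson_pmf` and `joint_pmf_poissonized` and the numerical values are assumptions): it compares the joint law of the two box counts, when a Poisson number of balls is thrown, with the product of two Poisson laws.

```python
import math

def poisson_pmf(k, lam):
    """P(Poisson(lam) = k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def joint_pmf_poissonized(a, b, lam, p1):
    """P(box 1 holds a balls and box 2 holds b balls) when a Poisson(lam)
    number of balls is thrown and each ball falls in box 1 with probability p1."""
    n = a + b
    return poisson_pmf(n, lam) * math.comb(n, a) * p1 ** a * (1 - p1) ** b

lam, p1 = 7.0, 0.3
# Largest gap between the joint law and the product of Poisson laws.
max_gap = max(
    abs(joint_pmf_poissonized(a, b, lam, p1)
        - poisson_pmf(a, lam * p1) * poisson_pmf(b, lam * (1 - p1)))
    for a in range(6) for b in range(6)
)
```

The gap is zero up to floating-point rounding, which is the thinning property of the Poisson law underlying the technique.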

The results will be stated in Section 2. We shall see the main techniques in Section 3.
Section 4 will be devoted to the study of the heights. We shall first turn our attention to
the upper bound. Due to the phase transition, we shall give two different proofs of the
lower bound of $H_{n,j}$, depending on whether the integer $j$ is less than the critical parameter $j^*$
or not. The results on the saturation levels will be proved in Section 5. Finally, in Section 6,
we shall explain heuristically why a phase transition may occur in the asymptotics of $H_{n,j}$,
but not in those of $H_{n,n^\alpha}$, $G_{n,j}$ and $G_{n,n^\alpha}$.

¹ We stress that this assumption is crucial in the proof of Theorem 1 in [13] and seems to have been
overlooked in the statement.



2 Formulation of the main results 

Recall that $p$ is a random probability measure on $\mathbb{N}$. We denote its law by $\nu$. If we denote by
$\mathrm{Prob}_{\mathbb{N}}$ the space of probability measures on $\mathbb{N}$, $\nu$ is a probability measure on $\mathrm{Prob}_{\mathbb{N}}$, called
the splitting law. We assume that $\nu$ is not geometric², in the sense that there is no real
number $r \in (0, 1)$ such that with probability one, all the masses of the atoms of $p$ belong
to $\{r^n, n \in \mathbb{Z}_+\}$. In particular, the degenerate case when $p$ is a Dirac point mass a.s. is
excluded.

As explained in the introduction, we consider a family $(p(i), i \in \mathcal{U})$ of independent
copies of $p$ labeled by the individuals of the genealogical tree $\mathcal{U}$. The multiplicative cascade
construction defines for each generation $k$ a probability measure $(p_i, |i| = k)$ on $\mathbb{N}^k$. Taking
the logarithm of the masses, we may encode the latter by the following random point measure on
$\mathbb{R}_+$:

$$Z^{(k)} := \sum_{i : |i| = k} \delta_{-\ln p_i},$$

where $\delta_z$ stands for the Dirac point mass at $z$. Note that if $p_i = 0$, i.e. if there is no box at $i$,
the individual $i$ is omitted in the sum defining $Z^{(k)}$. Likewise, in the sequel, we shall always
consider individuals which have a positive mass. It follows immediately from the structure
of the multiplicative cascades that $(Z^{(k)}, k \in \mathbb{Z}_+)$ is a branching random walk, in the sense
that for all integers $k, k' \ge 0$, $Z^{(k+k')}$ is obtained from $Z^{(k)}$ by replacing each atom $z$ of
$Z^{(k)}$ by a family $\{z + y, y \in \mathcal{Y}_z\}$, where each $\mathcal{Y}_z$ is an independent copy of the family of
atoms of $Z^{(k')}$.

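Let us introduce quantities defined via the splitting law $\nu$. First, we define the Laplace
transform of the intensity measure of $Z^{(1)}$ by

$$L(\theta) := \mathbb{E}\big[\langle Z^{(1)}, e^{-\theta \, \cdot} \rangle\big]$$

for $\theta \in \mathbb{R}$. We can also write

$$L(\theta) = \mathbb{E}\Big[\sum_{i \in \mathbb{N}} p_i^\theta\Big] = \int_{\mathrm{Prob}_{\mathbb{N}}} \Big(\sum_{i \in \mathbb{N}} p_i^\theta\Big) \, \nu(dp),$$

with the convention that $p^\theta = 0$ when $p = 0$, even when $\theta \le 0$. Because $p$ is not a Dirac point
mass a.s., $L(0) > 1$. The function $L : \mathbb{R} \to (0, \infty]$ is decreasing with $L(1) = 1$. We define

$$\underline{\theta} := \inf\{\theta \in \mathbb{R} : L(\theta) < \infty\},$$

so that $L(\theta) < \infty$ when $\theta > \underline{\theta}$. Note that $\underline{\theta}$ may be negative. We can show from Hölder's
inequality that $\ln L$ is a convex function, which implies the convexity of $L$ and that the
function

$$\varphi(\theta) := \ln L(\theta) - \theta \frac{L'(\theta)}{L(\theta)}$$

is increasing on $(\underline{\theta}, 0)$ and decreasing on $(0, \infty)$. As $L(1) = 1$ and $L$ decreases, we have
$\varphi(1) = -L'(1) > 0$, and thus the set of $\theta \in (\underline{\theta}, \infty)$ such that $\varphi(\theta) > 0$ is a non-empty open
interval $(\theta_*, \theta^*)$:

$$\theta_* := \inf\{\theta > \underline{\theta} : \varphi(\theta) > 0\} \quad \text{and} \quad \theta^* := \sup\{\theta > \underline{\theta} : \varphi(\theta) > 0\}.$$

Note that $\theta^* > 1$ and $\theta_* < 0$ if $\underline{\theta} < 0$. The convexity of $\ln L$ implies that $-L/L'$ is an
increasing function. We define

$$C_* := \lim_{\theta \downarrow \theta_*} -\frac{L(\theta)}{L'(\theta)} \quad \text{and} \quad C^* := \lim_{\theta \uparrow \theta^*} -\frac{L(\theta)}{L'(\theta)}.$$

² Working with a geometric splitting law would induce a phenomenon of periodicity which we shall not
discuss here for simplicity. However, results similar to those proven in this work can be established by the
same techniques for geometric splitting laws.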

Inspired by the article of Hu and Shi [21] (see Lemma 4 below), we shall sometimes
need the following assumption: there exists $\delta > 0$ such that

$$L(-\delta) < \infty \quad \text{and} \quad \int_{\mathrm{Prob}_{\mathbb{N}}} \Big(\sum_{i \in \mathbb{N}} \mathbf{1}_{p_i > 0}\Big)^{1+\delta} \nu(dp) < \infty. \tag{1}$$

To study the asymptotics of the saturation levels, we shall sometimes need the following
hypothesis:

$$-\infty < \theta_* < 0 \quad \text{and} \quad \varphi(\theta_*) = 0. \tag{2}$$

We can now state the results that we shall prove. Concerning the asymptotic behaviours
of the heights $H_{n,j}$, we have the following results which complete and improve Proposition 2
in [5].

Theorem 1 Let $j \ge 2$ be an integer, set

$$C_j := \begin{cases} -j / \ln L(j) & \text{if } j < \theta^*, \\ C^* & \text{if } j \ge \theta^*. \end{cases}$$

Then

$$H_{n,j} \underset{n \to \infty}{\sim} C_j \ln n \quad \text{a.s.}$$

More precisely,

$$H_{n,j} \le C_j \ln n + O(\ln \ln n) \quad \text{a.s.}, \tag{3}$$

and (3) is an equality if $j < \theta^*$ or if (1) holds.

We see that there is a phase transition in the asymptotic behaviour of $H_{n,j}$ at the integer
$\lceil \theta^* \rceil$ when $2 < \theta^* < \infty$. We point out that it may happen that $\theta^* \le 2$ or $\theta^* = \infty$, in which
case there is no phase transition. For instance, one can show that if $p = (p_1, p_2, \ldots)$ with
$p_1 = 1 - 0.75U$, $p_j = 0.05U$ for all $j \in \{2, \ldots, 16\}$ and $p_j = 0$ for all $j \ge 17$, where $U$ is
uniformly distributed on $(0, 1)$, then $\theta^* < 1.99$. On the other hand, for every $a \in [1/2, 1)$, if
$p$ is equal to $(1/2, 1/2, 0, \ldots)$ with probability $a$ and to $(1/3, 1/3, 1/3, 0, \ldots)$ with probability
$1 - a$, then $\theta^* = \infty$.
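The first example can be checked numerically. The sketch below assumes the reading $p_1 = 1 - 0.75U$ and $p_j = 0.05U$ for $2 \le j \le 16$, for which $L$ has a closed form when $\theta > -1$; the function names and the finite-difference derivative are choices of this sketch, and $\varphi(\theta) = \ln L(\theta) - \theta L'(\theta)/L(\theta)$ is the function defined above.

```python
import math

def L(theta):
    """L(theta) = E[sum_i p_i^theta] for p_1 = 1 - 0.75U, p_2..p_16 = 0.05U,
    with U uniform on (0, 1); closed form valid for theta > -1."""
    first = (1 - 0.25 ** (theta + 1)) / (0.75 * (theta + 1))  # E[(1 - 0.75U)^theta]
    others = 15 * 0.05 ** theta / (theta + 1)                 # 15 * E[(0.05U)^theta]
    return first + others

def phi(theta, h=1e-6):
    """phi(theta) = ln L(theta) - theta * (ln L)'(theta), using a central difference."""
    dlogL = (math.log(L(theta + h)) - math.log(L(theta - h))) / (2 * h)
    return math.log(L(theta)) - theta * dlogL

sign_at_1 = phi(1.0)      # positive, since phi(1) = -L'(1) > 0
sign_at_199 = phi(1.99)   # slightly negative, consistent with theta^* < 1.99
```

Since $\varphi$ is decreasing on $(0, \infty)$, a sign change between $1$ and $1.99$ locates $\theta^*$ in that interval.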

In this direction, we also point out that the critical parameter $\theta^*$ is finite whenever

$$\Big\| \max_{j \in \mathbb{N}} p_j \Big\|_\infty = 1,$$

where $p = (p_j, j \in \mathbb{N})$ denotes a random probability measure on $\mathbb{N}$ with law $\nu$. Indeed, it is
easily seen that

$$\lim_{\theta \to \infty} L(\theta)^{1/\theta} = \Big\| \max_{j \in \mathbb{N}} p_j \Big\|_\infty.$$

If the right-hand side equals 1, then $g : \theta \mapsto -\ln L(\theta)/\theta$ has limit 0 at infinity. Now $g$ is
differentiable and $g(1) = 0$, so there exists $\theta_0 \in (1, \infty)$ such that $g'(\theta_0) = 0$. As $g'(\theta) = \theta^{-2} \varphi(\theta)$,
we conclude that $\theta_0 = \theta^* < \infty$.

We may also be interested in the asymptotics of $H_{n,n^\alpha}$, where $\alpha \in (0, 1)$. We shall prove
the following result.

Proposition 1 Suppose $\theta^* < \infty$. Let $\alpha \in (0, 1)$. Then

$$H_{n,n^\alpha} \underset{n \to \infty}{\sim} (1 - \alpha) C^* \ln n \quad \text{a.s.}$$

Furthermore,

$$H_{n,n^\alpha} \le (1 - \alpha) C^* \ln n + O(\ln \ln n) \quad \text{a.s.}, \tag{4}$$

and (4) is an equality whenever (1) holds.

Remark 1 The final assertions in Theorem 1 and Proposition 1 rely on the work of Hu
and Shi [21]. McDiarmid's setting in [21] can however be considered; we can prove that the
results stated in Theorem 1 and in Proposition 1 still hold if the assumption (1) is replaced
by the following:



Probfs 



l ft >o I ^(dp) < oo. (5) 

i£N J 

For instance, suppose that $p = (p_1, 1 - p_1, 0, 0, \ldots)$, where $p_1$ is a random variable with
density $\mathbf{1}_{0 < x < e^{-1}} \, x^{-1} \ln^{-2} x \, dx$. Then $\underline{\theta} = 0$, so (1) does not hold. Nonetheless, $p$ is supported
by two integers a.s., so (5) holds. As a result, for all $j \ge 2$, $H_{n,j} = C_j \ln n + O(\ln \ln n)$ a.s.
Furthermore, as $\| \max_{j \in \mathbb{N}} p_j \|_\infty = 1$, the discussion below Theorem 1 ensures that $\theta^*$ is finite,
so for all $\alpha \in (0, 1)$, $H_{n,n^\alpha} = (1 - \alpha) C^* \ln n + O(\ln \ln n)$ a.s.

For the sake of simplicity, we shall show Theorem 1 and Proposition 1 only in the setting
of [21] (see Proposition 4 below). The general proof can however be easily carried out.

Concerning the asymptotic behaviours of the saturation levels $G_{n,j}$, we shall prove the
following theorem.

Theorem 2 Let $j \ge 1$ be an integer.

• If $\theta_* = -\infty$, then

$$G_{n,j} \underset{n \to \infty}{\sim} C_* \ln n \quad \text{a.s.}$$

• Suppose that (2) holds. Then

$$G_{n,j} \underset{n \to \infty}{\sim} C_* \ln n \quad \text{a.s.}$$

More precisely,

$$G_{n,j} \ge C_* \ln n + O(\ln \ln n) \quad \text{a.s.}, \tag{6}$$

and (6) is an equality if (1) holds.

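A sufficient condition to guarantee that $-\infty < \theta_* < 0$ and $\varphi(\theta_*) = 0$ is:

$$\underline{\theta} > -\infty \quad \text{and} \quad \lim_{\theta \downarrow \underline{\theta}} L(\theta) = \infty.$$

Indeed, suppose this condition is fulfilled, and suppose for contradiction that $\varphi(\theta) > 0$ for all $\theta \in (\underline{\theta}, 0)$. Let $\theta_0 \in (\underline{\theta}, 0)$.
As $\theta \mapsto \ln L(\theta)/\theta$ is decreasing (its derivative is $-\theta^{-2} \varphi(\theta) < 0$), we have for all $\theta \in (\underline{\theta}, \theta_0)$:

$$\ln L(\theta) \le \theta \, \frac{\ln L(\theta_0)}{\theta_0} \le \underline{\theta} \, \frac{\ln L(\theta_0)}{\theta_0},$$

which contradicts $\lim_{\theta \downarrow \underline{\theta}} L(\theta) = \infty$.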

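We shall also show the following proposition.

Proposition 2 Suppose that (2) holds. Let $\alpha \in (0, 1)$. Then

$$G_{n,n^\alpha} \underset{n \to \infty}{\sim} (1 - \alpha) C_* \ln n \quad \text{a.s.}$$

Moreover,

$$G_{n,n^\alpha} \ge (1 - \alpha) C_* \ln n + O(\ln \ln n) \quad \text{a.s.}, \tag{7}$$

and (7) is an equality if (1) holds.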

We now conclude this section by discussing an illustrative example. Consider the case
$p = (U, 1 - U, 0, 0, \ldots)$, where $U$ is uniformly distributed on $[0, 1]$. Then $L(\theta) = 2/(\theta + 1)$,
so $\underline{\theta} = -1$ and (1) holds. The discussions below the theorems yield: $\theta^* < \infty$ and (2) hold.
Hence, all our results may be applied. Easy calculations yield $\varphi(\theta) = \ln 2 - \ln(\theta + 1) + \theta/(\theta + 1)$
and $\lceil \theta^* \rceil = 4$, so that there is a phase transition. We can show that $C_2 = 2/\ln(3/2) \approx
4.93260\ldots$, $C_3 = 3/\ln 2 \approx 4.32808\ldots$ and that $C^* > C_*$ are the solutions of the equation

$$c \ln\Big(\frac{2e}{c}\Big) = 1,$$

i.e. $C^* \approx 4.31107\ldots$ and $C_* \approx 0.37336\ldots$
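The constants of this example can be recovered numerically from the formulas above. The following is a small sketch under the stated closed form $L(\theta) = 2/(\theta+1)$, for which $-L/L' = \theta + 1$ and $-j/\ln L(j) = j/\ln((j+1)/2)$; the bisection helper and all variable names are illustrative.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi(theta):
    """phi for L(theta) = 2 / (theta + 1), theta > -1."""
    return math.log(2) - math.log(theta + 1) + theta / (theta + 1)

theta_upper = bisect(phi, 1.0, 100.0)          # theta^*, zero of phi on (1, inf)
theta_lower = bisect(phi, -1 + 1e-9, -1e-9)    # theta_*, zero of phi on (-1, 0)
C_upper = theta_upper + 1                      # C^* = -L/L' at theta^*
C_lower = theta_lower + 1                      # C_* = -L/L' at theta_*

def C(j):
    """Height constant C_j of Theorem 1 for this example."""
    return j / math.log((j + 1) / 2) if j < theta_upper else C_upper
```

For instance, `C(2)` and `C(3)` exceed `C_upper`, while `C(j)` equals `C_upper` for every `j >= 4`, which is the phase transition at $\lceil \theta^* \rceil = 4$.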

As $p$ is supported by two integers, our model may be interpreted in terms of random
split trees; the procedure described in the introduction yields a random tree where all balls
are stored at leaves. We denote by $T_{n,j}$ the tree obtained when $n$ balls have been thrown
and when the boxes have a rupture threshold of $j$. We now define another random split
tree $\tilde{T}_{n,j}$, also related to our model. We imagine that each box has a rupture threshold of $j$,
but when a ball falls into some box $i$ already containing $j - 1$ balls, it remains in that box
and the $j - 1$ other balls are shared out amongst the two children of $i$ independently and
with probability (1/2, 1/2). No other ball is then allowed to be stored at the box $i$: when a
ball falls into $i$, it is placed into one of the two children of $i$ with probability (1/2, 1/2), and
one has to consider whether the ball stays at that box or not. We denote by $\tilde{T}_{n,j}$ the tree
obtained when $n$ balls have been thrown. Note that each internal node of $\tilde{T}_{n,j}$ has exactly
one ball, whereas its leaves have fewer than $j$ balls. We see that the height $\tilde{H}_{n,j}$ of the tree
$\tilde{T}_{n,j}$ is less than or equal to the height $H_{n,j}$ of the tree $T_{n,j}$.

It is remarkable that $\tilde{T}_{n,2}$ has the law of the random binary search tree. Robson [27] was
the first to be interested in the height of the random binary search tree. In [25], Pittel studied
its height and its saturation level. Devroye proved in [11], [12] that the saturation level is
asymptotically equivalent to $C_* \ln n$ in probability as in our model, but that the height $\tilde{H}_{n,2}$
is equivalent to $C^* \ln n$ in probability, so that $\tilde{H}_{n,2}$ is not equivalent to $H_{n,2}$. Regarding the
random binary search tree as a random split tree, he proved in [13] that all the heights $\tilde{H}_{n,j}$
(regardless of the value of $j$) of the random binary search tree are asymptotically equivalent
to $C^* \ln n$ in probability. In particular, there is no phase transition, contrary to our case.
There is of course no contradiction, as the random binary search tree does not correspond
to a model treated in this work. Indeed, in terms of random split trees, internal nodes of
the binary search tree retain exactly one ball whereas all the balls are stored at leaves in our
case.

3 Preliminaries 

3.1 Some results on branching random walks 

In this section, we recall some results on branching random walks that will be used in the 
proofs. 

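We begin by stating a key result obtained by Biggins in [7]. For every $\theta > \underline{\theta}$, we introduce

$$W^{(k)}(\theta) := L(\theta)^{-k} \langle Z^{(k)}, e^{-\theta \, \cdot} \rangle = L(\theta)^{-k} \sum_{i : |i| = k} p_i^\theta.$$

For every $k \in \mathbb{Z}_+ \cup \{\infty\}$, we denote by $\mathcal{F}_k$ the $\sigma$-algebra generated by $(p_i, |i| \le k)$.

Lemma 1 For every $\theta > \underline{\theta}$, $(W^{(k)}(\theta), k \in \mathbb{Z}_+)$ is a martingale with respect to the filtration
$(\mathcal{F}_k)_{k \in \mathbb{Z}_+}$. Moreover, if $\theta \in (\theta_*, \theta^*)$, it is bounded in $L^\gamma(\mathbb{P})$ for some $\gamma > 1$ and therefore
uniformly integrable, and its terminal value

$$W(\theta) := \lim_{k \to \infty} W^{(k)}(\theta)$$

is positive a.s.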

Applying Corollary 4 in [7], we can obtain a precise estimate of the number of boxes at
generation $k$ with size of order $\exp(k L'(\theta)/L(\theta))$. Specifically:

Lemma 2 For all real numbers $a$ and $b$ such that $a < b$ and for all $\theta \in (\theta_*, \theta^*)$, we have
with probability one that

lim Vke-WtyU G N k : exp (-6 + kU{8)/h{8)) < Pi < exp (-a + kU{8)/ h{8))} 

fc^oo 




s 



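We next turn our interest to the asymptotic behaviours of extreme sizes of boxes at
generation $k$:

$$\underline{p}(k) := \inf\{p_i : p_i > 0, |i| = k\} \quad \text{and} \quad \overline{p}(k) := \sup\{p_i : |i| = k\}.$$

We have the following general results:

Lemma 3 If $\theta^* < \infty$, then

$$\lim_{k \to \infty} \big(-\ln \overline{p}(k) - k/C^*\big) = \infty \quad \text{and} \quad \lim_{k \to \infty} \frac{\ln \overline{p}(k)}{k} = -\frac{1}{C^*} \quad \text{a.s.}$$

Likewise, if (2) holds, then

$$\lim_{k \to \infty} \big(\ln \underline{p}(k) + k/C_*\big) = \infty \quad \text{and} \quad \lim_{k \to \infty} \frac{\ln \underline{p}(k)}{k} = -\frac{1}{C_*} \quad \text{a.s.}$$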

To have precise estimates of the behaviours of the heights and of the saturation levels, we
shall sometimes need sharper results on the asymptotics of $\overline{p}(k)$ and $\underline{p}(k)$. The following
result, proved by Hu and Shi in [21], will be very useful.

Lemma 4 We assume that (1) holds.

• If $\theta^* < \infty$, then

$$\limsup_{k \to \infty} \frac{-\ln \overline{p}(k) - k/C^*}{\ln k} = \frac{3}{2\theta^*} \quad \text{a.s.}$$

• If (2) holds, then

$$\limsup_{k \to \infty} \frac{\ln \underline{p}(k) + k/C_*}{\ln k} = -\frac{3}{2\theta_*} \quad \text{a.s.}$$

Addario-Berry and Reed in [2] also studied the minima in branching random walks, but they
require a stronger condition than (1). Their assumption is however fulfilled for random split
trees.



3.2 Poissonization 

Let us present the methods used in the proofs. We have to consider the number of balls
belonging to each box at each generation. Now, conditionally on $\mathcal{F}_k$, when $n$ balls have been
thrown, the number of balls in the box $i$, $|i| = k$, follows a binomial law with parameters $n$ and $p_i$.
Furthermore, these random variables are not independent. A classical idea to circumvent
those difficulties (see for instance Gnedin et al. in [17] or Holst in [20]) is to consider a
randomized version of the total number of balls: instead of throwing initially $n$ balls, one
throws $\mathcal{P}_n$ balls, where $\mathcal{P}_n$ is a Poisson variable with parameter $n$ which is independent of
$(p(i), i \in \mathcal{U})$.

More precisely, we suppose we are given a standard Poisson process $(\mathcal{P}_x)_{x \ge 0}$ independent
of $\mathcal{F}_\infty$. For every individual $i \in \mathcal{U}$ and for every $x \in (0, \infty)$, we denote by $C(i; x)$ the number
of balls at $i$ when the first $\mathcal{P}_x$ balls have been thrown. For all $x, y \in (0, \infty)$ and $k \in \mathbb{Z}_+$, we
denote by $\mathcal{N}_{x,y}(k)$ the number of boxes at generation $k$ containing at least $y$ balls when the
first $\mathcal{P}_x$ balls have been thrown:

$$\mathcal{N}_{x,y}(k) := \#\{i \in \mathbb{N}^k : C(i; x) \ge y\}.$$

Conditionally on $\mathcal{F}_k$, the random variables $(C(i; n))_{|i| = k}$ are independent Poisson variables
with parameters $n p_i$. That is why in our proofs, we shall first focus on $\mathcal{N}_{n,j}(k)$. We shall
then show that $\mathcal{N}_{n,j}(k)$ is close to the number $N_{n,j}(k)$ of boxes at generation $k$ containing
at least $j$ balls when exactly $n$ balls have been thrown. Similarly, we denote by $\mathcal{M}_{x,y}(k)$ the
number of boxes at generation $k$ containing fewer than $y$ balls when the first $\mathcal{P}_x$ balls have
been thrown:

$$\mathcal{M}_{x,y}(k) := \#\{i \in \mathbb{N}^k : C(i; x) < y\}$$

and by $M_{n,j}(k)$ the number of boxes at generation $k$ containing fewer than $j$ balls when $n$ balls
have been thrown.

We now prove two estimates using the technique of Poissonization that will then be used 
in the sequel. 

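Lemma 5 Let $p \in (0, \infty)$. There exist two finite constants $c(p)$ and $d(p)$ such that

$$\sup_{j \ge p} \, \sup_{x > 0} \, \mathbb{P}(\mathcal{P}_x \ge j) \, j^p x^{-p} \le c(p), \tag{8}$$

and

$$\sup_{j \in \mathbb{N}} \, \sup_{x > 0} \, \mathbb{P}(\mathcal{P}_x < j) \, j^{-p} x^p \le d(p). \tag{9}$$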

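Proof: We begin by showing (8). Let $j \ge p$ be an integer. Let $x > 0$. By Markov's
inequality,

$$\mathbb{P}(\mathcal{P}_x \ge j) = \mathbb{P}\big(\mathcal{P}_x^p \mathbf{1}_{\mathcal{P}_x \ge j} \ge j^p\big) \le j^{-p} \, \mathbb{E}\big[\mathcal{P}_x^p \mathbf{1}_{\mathcal{P}_x \ge j}\big].$$

Therefore we only have to bound from above

$$x^{-p} \, \mathbb{E}\big[\mathcal{P}_x^p \mathbf{1}_{\mathcal{P}_x \ge j}\big] = x^{-p} \sum_{k=j}^{\infty} k^p e^{-x} \frac{x^k}{k!}.$$

As $\Gamma(k - p + 1) k^p / k! \to 1$ as $k$ tends to infinity, there exists a finite constant $c(p) \ge 1$ such
that for all $k > p - 1$,

$$\frac{k^p}{k!} \le \frac{c(p)}{\Gamma(k - p + 1)}.$$

As $j \ge p > p - 1$, we thus have

$$x^{-p} \sum_{k=j}^{\infty} k^p e^{-x} \frac{x^k}{k!} \le c(p) \, e^{-x} \sum_{k=j}^{\infty} \frac{x^{k-p}}{\Gamma(k - p + 1)} = c(p) \, e^{-x} \sum_{k=0}^{\infty} \frac{x^{k+u}}{\Gamma(k + u + 1)},$$

where $u := j - p \ge 0$. Applying the formulae 6.5.1, 6.5.4 and 6.5.29 in pQ, we get that

$$e^{-x} \sum_{k=0}^{\infty} \frac{x^{k+u}}{\Gamma(k + u + 1)} \le 1,$$

which proves (8).

To show (9), we write by Markov's inequality:

$$\mathbb{P}(\mathcal{P}_x < j) = \mathbb{P}\big((\mathcal{P}_x + 1)^{-p} \ge j^{-p}\big) \le j^p \, \mathbb{E}\big[(\mathcal{P}_x + 1)^{-p}\big] = j^p \sum_{k=0}^{\infty} (k + 1)^{-p} e^{-x} \frac{x^k}{k!}.$$

As $\Gamma(k + p + 1)(k + 1)^{-p} / k! \to 1$ as $k$ tends to infinity, there exists a finite constant $d(p) \ge 1$
such that for all $k \in \mathbb{Z}_+$, $\Gamma(k + p + 1)(k + 1)^{-p} / k! \le d(p)$. We thus have

$$\mathbb{P}(\mathcal{P}_x < j) \le d(p) \, j^p x^{-p} e^{-x} \sum_{k=0}^{\infty} \frac{x^{k+p}}{\Gamma(k + p + 1)}.$$

We conclude as before. □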
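The series appearing in both parts of the proof can be explored numerically. The sketch below is illustrative only (the function name `gamma_series` and the truncation at 400 terms are assumptions suited to moderate values of $x$): it evaluates $e^{-x} \sum_{k \ge 0} x^{k+u}/\Gamma(k+u+1)$ in log space and lets one observe that it never exceeds 1.

```python
import math

def gamma_series(u, x, terms=400):
    """Truncated value of e^{-x} * sum_{k>=0} x^{k+u} / Gamma(k+u+1).

    Each term is computed in log space to avoid overflow; 400 terms are
    ample for x up to about 50.
    """
    return sum(
        math.exp(-x + (k + u) * math.log(x) - math.lgamma(k + u + 1))
        for k in range(terms)
    )

total_mass = gamma_series(0.0, 5.0)   # u = 0: the Poisson weights sum to 1
shifted = gamma_series(1.0, 2.0)      # u = 1: equals 1 - e^{-2}
```

For $u = 0$ the series is the total mass of a Poisson law, and for $u = 1$ it equals $1 - e^{-x}$, which gives two exact checkpoints.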

Remark 2 We stress that in (8), the integer $j$ cannot be less than $p$. This restriction lies at
the heart of the existence of the phase transition in the asymptotics of $H_{n,j}$ (see Proposition 3
below, and more precisely Lemma 6). On the other hand, there is no restriction in (9); no
phase transition appears in the asymptotics of $G_{n,j}$.

4 Study of the heights 

In this section, we prove Theorem 1 and Proposition 1. We first consider the upper
bound, and then the lower bound. We stress that the phase transition is
glimpsed at the beginning of the study of the upper bound (see Remark 3 below). It is
however proved in the paragraph dealing with the lower bound.

4.1 Upper bound 

Equation (8) will enable us to have a uniform upper bound of $N_{n,j}(k)$ independent of the
sizes of the boxes that will eventually lead to the inequalities (3) and (4).

Proposition 3 Let $j \ge 1$ be an integer and $\alpha \in [0, 1)$ such that $(j, \alpha) \neq (1, 0)$. Then

$$\limsup_{n \to \infty} \frac{H_{n,jn^\alpha} - (1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n}{\ln \ln n} \le \frac{1}{-\ln L(\theta)} \quad \text{a.s.}$$

for every $\theta > 1$ if $\alpha > 0$, and for every $\theta \in (1, j]$ if $\alpha = 0$.

Remark 3 Suppose that $\alpha = 0$. We deduce from Proposition 3 that:

$$H_{n,j} \le \min_{\theta \in (1, j]} h(\theta) \ln n + O(\ln \ln n) \quad \text{a.s.},$$

where $h$ is the function $\theta \mapsto -\theta / \ln L(\theta)$. Now, $h$ is differentiable and $h'(\theta) = -\varphi(\theta) \ln^{-2} L(\theta)$,
so $h$ is decreasing on $(1, \theta^*]$ and increasing on $[\theta^*, \infty)$. The minimum of $h$ on $(1, j]$ is therefore

• $-j / \ln L(j)$ if $j < \theta^*$,
• $-\theta^* / \ln L(\theta^*) = C^*$ if $j \ge \theta^*$.

This is where the phase transition shows up. It will be proved in the study of the lower bound.

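Recall that $H_{n,jn^\alpha} \le k$ if and only if at generation $k$, every box contains fewer than $jn^\alpha$
balls when $n$ balls have been thrown. Because we want to show that $H_{n,jn^\alpha}$ is bounded
from above by $\big((1 - \alpha) \frac{\theta}{-\ln L(\theta)} + \varepsilon\big) \ln n$, we take $k \approx \big((1 - \alpha) \frac{\theta}{-\ln L(\theta)} + \varepsilon\big) \ln n$. In other
words, one initially throws $n \approx \exp\big(k \big(\frac{-\ln L(\theta)}{(1 - \alpha)\theta} - \varepsilon\big)\big)$ balls and we show that every box
of the $k$-th generation contains fewer than $jn^\alpha$ balls.

Let $u$ and $u'$ be two real numbers such that

$$u > \frac{1}{(1 - \alpha)\theta} \quad \text{and} \quad u < u' < \frac{1}{\alpha}\Big(u - \frac{1}{\theta}\Big),$$

where for $\alpha = 0$, the second condition reduces to $u' > u$. Define for every $k \ge 1$

$$x_k := k^{-u} \exp\Big(k \, \frac{1}{1 - \alpha} \, \frac{-\ln L(\theta)}{\theta}\Big) \quad \text{and} \quad \phi_k := k^{-u'} \exp\Big(k \, \frac{1}{1 - \alpha} \, \frac{-\ln L(\theta)}{\theta}\Big).$$

Informally, $x_k$ corresponds to the number of balls thrown when we consider boxes at
generation $k$. Note that $x_k \approx \exp\big(k \big[\frac{1}{1 - \alpha} \frac{-\ln L(\theta)}{\theta} - \varepsilon\big]\big)$.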

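As mentioned in the preliminaries, the argument relies on Poissonization. We are first
interested in $\mathcal{N}_{x,y}(k)$ and we shall see how to depoissonize.

Lemma 6 For almost all $\omega$, there exists $k_0(\omega)$ such that

$$\mathcal{N}_{x_k, j\phi_k^\alpha}(k) = 0 \quad \text{for all } k \ge k_0(\omega).$$

Proof: Let $x > 0$ and $y \ge \theta$. We calculate $\mathbb{E}[\mathcal{N}_{x,y}(k) \mid \mathcal{F}_k]$. We write

$$\mathcal{N}_{x,y}(k) = \sum_{i : |i| = k} \mathbf{1}_{C(i; x) \ge y} = \sum_{i : |i| = k} \mathbf{1}_{C(i; x) \ge \lceil y \rceil}.$$

Conditionally on $\mathcal{F}_k$, $(C(i; x))_{|i| = k}$ are Poisson variables with parameters $x p_i$, so

$$\mathbb{E}[\mathcal{N}_{x,y}(k) \mid \mathcal{F}_k] = \sum_{i : |i| = k} \mathbb{P}\big(\mathcal{P}_{x p_i} \ge \lceil y \rceil\big).$$

As $\lceil y \rceil \ge \theta$, (8) ensures that

$$\mathbb{E}[\mathcal{N}_{x,y}(k) \mid \mathcal{F}_k] \le c(\theta) \lceil y \rceil^{-\theta} \sum_{i : |i| = k} (x p_i)^\theta$$

(see Remark 2). Now, as $y \ge \theta$,

$$\lceil y \rceil \ge y \ge \frac{\theta - 1}{\theta} \, y.$$

In the notations of Lemma 1, we have:

$$\mathbb{E}[\mathcal{N}_{x,y}(k) \mid \mathcal{F}_k] \le c'(\theta) \, y^{-\theta} x^\theta L(\theta)^k W^{(k)}(\theta),$$

where $c'(\theta) := c(\theta)(1 - 1/\theta)^{-\theta}$. We finally get

$$\mathbb{E}[\mathcal{N}_{x,y}(k)] \le c'(\theta) \, y^{-\theta} x^\theta L(\theta)^k.$$

Taking $x = x_k$ and $y = j\phi_k^\alpha$, we get for all $k$ sufficiently large so that $j\phi_k^\alpha \ge \theta$ (note that if
$\alpha = 0$, then for all $k$, $j\phi_k^\alpha = j \ge \theta$):

$$\mathbb{E}\big[\mathcal{N}_{x_k, j\phi_k^\alpha}(k)\big] \le c'(\theta) \, j^{-\theta} k^{\theta(u'\alpha - u)}.$$

Now $\theta(u'\alpha - u) < -1$. As a consequence

$$\mathbb{E}\Big[\sum_{k \in \mathbb{N}} \mathcal{N}_{x_k, j\phi_k^\alpha}(k)\Big] = \sum_{k \in \mathbb{N}} \mathbb{E}\big[\mathcal{N}_{x_k, j\phi_k^\alpha}(k)\big] < \infty.$$

In particular, there is an a.s. finite number of integers $k$ such that $\mathcal{N}_{x_k, j\phi_k^\alpha}(k) \ge 1$. □

We now show how to have information on $N_{n,jn^\alpha}(k)$ itself.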



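Proof of Proposition 3: Let $E_k$ denote the event $\{\mathcal{P}_{x_k} < \phi_{k+1}\}$. Because $u' > u$, $\phi_{k+1}/x_k$
tends to 0 as $k$ tends to infinity, so there exists an integer $k_1$ such that for all $k \ge k_1$,
$\phi_{k+1} \le x_k/2$. Consequently,

$$\mathbb{P}(E_k) \le \mathbb{P}(\mathcal{P}_{x_k} < x_k/2) = \mathbb{P}(\mathcal{P}_{x_k} - x_k < -x_k/2) \le \mathbb{P}(|\mathcal{P}_{x_k} - x_k| > x_k/2).$$

The variance of the Poisson variable $\mathcal{P}_{x_k}$ being $x_k$, we get by Chebyshev's inequality: $\mathbb{P}(E_k) \le
4 x_k^{-1}$ for all $k \ge k_1$. As $\sum_{k \in \mathbb{N}} x_k^{-1} < \infty$, the Borel-Cantelli lemma ensures that for almost all
$\omega$, there exists $k_2(\omega)$ such that $\mathcal{P}_{x_k} \ge \lfloor \phi_{k+1} \rfloor$ for all $k \ge k_2(\omega)$.

Applying Lemma 6 we deduce that, if we define the event $\Omega_0$ by

$$\Omega_0 := \{\omega : \text{there exists } k_3(\omega) \text{ such that } N_{\lfloor \phi_{k+1} \rfloor, j\phi_k^\alpha}(k) = 0 \text{ for all } k \ge k_3(\omega)\},$$

then $\mathbb{P}(\Omega_0) = 1$. Notice that there exists an integer $k_4$ such that for all $k \ge k_4$,

$$k^{-u'} \ge \exp\Big(-k \, \frac{-\ln L(\theta)}{2(1 - \alpha)\theta}\Big).$$

There exists a rank $k_5$ from which the sequence $(\phi_k)_{k \ge k_5}$ is increasing. Furthermore, that
sequence tends to infinity. Let $\omega \in \Omega_0$. Let $n$ be an integer greater than $\phi_{k_5}$, large enough
so that the unique $k \ge k_5$ satisfying $\phi_k \le n < \phi_{k+1}$ is greater than $k_3(\omega)$ and $k_4$. Because
$N_{\lfloor \phi_{k+1} \rfloor, j\phi_k^\alpha}(k) = 0$ and $n \le \lfloor \phi_{k+1} \rfloor$, $N_{n, j\phi_k^\alpha}(k) = 0$. Now, $jn^\alpha \ge j\phi_k^\alpha$, so $N_{n,jn^\alpha}(k) = 0$ and
$H_{n,jn^\alpha} \le k$. Moreover, as $n \ge \phi_k$,

$$k \le (1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n + u'(1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln k.$$

Further, as $k \ge k_4$, we have:

$$\phi_k \ge \exp\Big(k \, \frac{-\ln L(\theta)}{2(1 - \alpha)\theta}\Big),$$

so $k \le 2(1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n$. Therefore

$$H_{n,jn^\alpha} \le k \le (1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n + u'(1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln\Big(2(1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n\Big).$$

We conclude that

$$\limsup_{n \to \infty} \frac{H_{n,jn^\alpha} - (1 - \alpha) \frac{\theta}{-\ln L(\theta)} \ln n}{\ln \ln n} \le u'(1 - \alpha) \frac{\theta}{-\ln L(\theta)} \quad \text{a.s.}$$

The fact that $u'$ can be chosen arbitrarily close to $\frac{1}{(1 - \alpha)\theta}$ completes the proof. □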
4.2 Lower bound 

We now study the lower bound. Combined with the upper bound supplied by Proposition 3,
this will prove Theorem 1 and Proposition 1.

4.2.1 The case $j \ge \theta^*$ or $\alpha \in (0, 1)$ and $\theta^* < \infty$

In this section, we prove that $H_{n,j} \ge C^* \ln n + O(\ln \ln n)$ a.s. if $j \ge \theta^*$ and (1) holds. We
also prove that $H_{n,n^\alpha} \ge (1 - \alpha) C^* \ln n + O(\ln \ln n)$ a.s. if $\theta^* < \infty$ and (1) holds. We shall
see that the largest box plays a key role. 

Proposition 4 We suppose that $\theta^* < \infty$. Let $j \ge 1$ be an integer and $\alpha \in [0, 1)$ such that
$(j, \alpha) \neq (1, 0)$. Under the assumption (1), we have:

$$\liminf_{n \to \infty} \frac{H_{n,jn^\alpha} - (1 - \alpha) C^* \ln n}{\ln \ln n} \ge \frac{3}{2 \ln L(\theta^*)} \quad \text{a.s.}$$

By definition, $H_{n,jn^\alpha} > k$ if and only if at generation $k$, there exists a box containing
at least $jn^\alpha$ balls when $n$ balls have been thrown. We shall see that in our setting, it
suffices to consider the largest box. As we intend to show that $H_{n,jn^\alpha}$ is bounded from
below by $((1 - \alpha) C^* - \varepsilon) \ln n$, we take $k \approx ((1 - \alpha) C^* - \varepsilon) \ln n$, i.e. one initially throws
$n \approx \exp\big(\frac{k}{(1 - \alpha) C^* - \varepsilon}\big)$ balls and we show that the largest box of the $k$-th generation
contains at least $jn^\alpha$ balls.

Let $\gamma$, $\gamma'$ and $\gamma''$ be three real numbers such that

$$\gamma > \frac{3}{2\theta^*}, \quad \gamma'' > \frac{\gamma}{1 - \alpha} \quad \text{and} \quad \gamma' < \gamma'' < \frac{\gamma' - \gamma}{\alpha}.$$

Note that if $\alpha = 0$, the third condition is simply $\gamma'' > \gamma'$. Define for all $k \in \mathbb{Z}_+$

$$x_k := k^{\gamma'} \exp\Big(\frac{k}{(1 - \alpha) C^*}\Big) \quad \text{and} \quad \phi_k := k^{\gamma''} \exp\Big(\frac{k}{(1 - \alpha) C^*}\Big).$$

We first show that a.s., $\mathcal{N}_{x_k, j\phi_k^\alpha}(k) \ge 1$ for all integers $k$ sufficiently large. To do so, we simply
consider the largest box.

Lemma 7 For almost all $\omega$, there exists $k_0(\omega)$ such that

$$\mathcal{N}_{x_k, j\phi_k^\alpha}(k) \ge 1 \quad \text{for all } k \ge k_0(\omega).$$

Proof: For every generation $k \in \mathbb{N}$, we consider an imaginary box $\tilde{b}(k)$ contained in the
largest box $b(k)$ of size $\overline{p}(k)$ such that a ball fallen in $b(k)$ is thrown in $\tilde{b}(k)$ with probability

$$\frac{\tilde{p}(k)}{\overline{p}(k)} \wedge 1, \quad \text{where } \tilde{p}(k) := k^{-\gamma} e^{-k/C^*}.$$

Informally, the box $\tilde{b}(k)$ has size $\tilde{p}(k) \wedge \overline{p}(k)$. We denote by $A_k$ the event $\{\tilde{p}(k) \le \overline{p}(k)\}$ and
by $B_k$ the event defined by

$$B_k := \{\text{the box } \tilde{b}(k) \text{ contains fewer than } j\phi_k^\alpha \text{ balls when the first } \mathcal{P}_{x_k} \text{ balls have been thrown}\}.$$

Conditionally on $A_k$, the box $\tilde{b}(k)$ has size $\tilde{p}(k)$, so the number of balls contained in $\tilde{b}(k)$
when $\mathcal{P}_{x_k}$ balls have been thrown is a Poisson variable with parameter $x_k \tilde{p}(k)$. As a result

$$\mathbb{P}(A_k \cap B_k) \le \mathbb{P}(B_k \mid A_k) = \mathbb{P}\big(\mathcal{P}_{x_k \tilde{p}(k)} < j\phi_k^\alpha\big).$$

Applying (9), we get:

$$\mathbb{P}(A_k \cap B_k) \le d(p) \, j^p k^{-2},$$

where $p := 2/(\gamma' - \gamma - \alpha\gamma'') > 0$, and hence $\sum_k \mathbb{P}(A_k \cap B_k) < \infty$. By the Borel-Cantelli
lemma, we deduce that a.s., for all $k$ sufficiently large, $\tilde{p}(k) > \overline{p}(k)$ or the box $\tilde{b}(k)$, which is
contained in the box $b(k)$, has at least $j\phi_k^\alpha$ balls when the first $\mathcal{P}_{x_k}$ balls have been thrown.
Now, by Lemma 4, we know that a.s., for all integers $k$ sufficiently large, $\tilde{p}(k) \le \overline{p}(k)$. Lemma
7 is therefore proved. □



We now deduce from Lemma [7J the lower bound of 



n,jn" 



Proof of Proposition 4 : The same calculations performed at the beginning of the proof
of Proposition 3 show that for almost all $\omega$, there exists $k_1(\omega)$ such that $\mathcal{P}_{x_k} \le \lfloor \varphi_{k-1} \rfloor$ for
all $k \ge k_1(\omega)$. Applying Lemma 7 we deduce that, if we define the event $\Omega_0$ by

$$\Omega_0 := \left\{\omega : \text{there exists } k_2(\omega) \text{ such that } N_{\lfloor \varphi_{k-1} \rfloor, j\varphi_k^a}(k) \ge 1 \text{ for all } k \ge k_2(\omega)\right\},$$

then $\mathbb{P}(\Omega_0) = 1$. Let $\omega \in \Omega_0$. The sequence $(\varphi_k)$ is increasing and tends to infinity. Let n
be an integer large enough so that the unique integer k satisfying $\varphi_{k-1} \le n < \varphi_k$ is greater
than $k_2(\omega)$. Because $N_{\lfloor \varphi_{k-1} \rfloor, j\varphi_k^a}(k) \ge 1$ and $\lfloor \varphi_{k-1} \rfloor \le n$, $N_{n, j\varphi_k^a}(k) \ge 1$. Now, $jn^a < j\varphi_k^a$, so
$N_{n,jn^a}(k) \ge 1$ and $H_{n,jn^a} > k$. Further, as $n < \varphi_k$,

$$k > (1-a)C^* \ln n - \gamma''(1-a)C^* \ln k.$$

As $\exp\left(\frac{k-1}{(1-a)C^*}\right) \le \varphi_{k-1} \le n$, we have : $k \le (1-a)C^* \ln n + 1$. Thus

$$H_{n,jn^a} > k > (1-a)C^* \ln n - \gamma''(1-a)C^* \ln\left((1-a)C^* \ln n + 1\right)$$

and

$$\liminf_{n\to\infty} \frac{H_{n,jn^a} - (1-a)C^* \ln n}{\ln\ln n} \ge -\gamma''(1-a)C^* \quad\text{a.s.}$$

The fact that $\gamma''$ can be chosen arbitrarily close to $\frac{3}{2(1-a)\theta^*}$ completes the proof. □
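
The sandwich on $k$ used at the end of this proof can be checked numerically. The following is a minimal sketch and not part of the original argument: the function names and the numerical values of $C^*$, $a$ and $\gamma''$ are arbitrary assumptions chosen only for illustration.

```python
import math

def phi(k, gamma2, a, c_star):
    """phi_k = k**gamma2 * exp(k / ((1 - a) * c_star)); gamma2 stands for gamma''."""
    return k ** gamma2 * math.exp(k / ((1.0 - a) * c_star))

def bracket_generation(n, gamma2, a, c_star):
    """Smallest k >= 1 with n < phi_k, so that phi_{k-1} <= n < phi_k."""
    k = 1
    while phi(k, gamma2, a, c_star) <= n:
        k += 1
    return k

def sandwich(n, gamma2, a, c_star):
    """Return (lower, k, upper), the two bounds on k derived in the proof."""
    k = bracket_generation(n, gamma2, a, c_star)
    scale = (1.0 - a) * c_star
    lower = scale * math.log(n) - gamma2 * scale * math.log(k)  # from n < phi_k
    upper = scale * math.log(n) + 1.0                           # from phi_{k-1} <= n
    return lower, k, upper

# Hypothetical parameter values, for illustration only.
for n in (10, 10**3, 10**6, 10**12):
    print(n, sandwich(n, gamma2=1.5, a=0.25, c_star=2.0))
```

In each printed triple the middle value $k$ lies strictly above the first bound and at most at the second one, which is the two-sided estimate that turns $H_{n,jn^a} > k$ into a bound in terms of $\ln n$ and $\ln\ln n$.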
4.2.2 The case $2 \le j < \theta^*$ and $a = 0$

We now prove that $H_{n,j} \ge C_j \ln n + O(\ln\ln n)$ a.s. if $j < \theta^*$. We shall notice that contrary
to the case $j \ge \theta^*$, the number of boxes matters more than their sizes.

Proposition 5 Let $j < \theta^*$ be an integer greater than 1. Then

$$\liminf_{n\to\infty} \frac{H_{n,j} - \frac{j}{-\ln L(j)} \ln n}{\ln\ln n} \ge \frac{1}{2\ln L(j)} \quad\text{a.s.}$$

The main difference with the case $j \ge \theta^*$ is that we cannot consider the largest box any more.
If $\theta^* < \infty$, one can even show by performing the usual calculations that with probability
one, for all $k \ge C^* \ln n + o(\ln n)$, the largest box of the k-th generation contains no ball
when n balls have been initially thrown. We shall rather focus on some other boxes which
are smaller but sufficiently numerous so that it is very unlikely that all of them contain less
than j balls when n balls have been thrown. As we want to prove that $H_{n,j}$ is bounded
from below by $\left(\frac{j}{-\ln L(j)} - \varepsilon\right) \ln n$, we consider the situation at the k-th generation when one
initially throws approximately $\exp\left(k\left(-\ln L(j)/j + \varepsilon\right)\right)$ balls.

The boxes that will play a key role are those appearing in Lemma 2 (recall that $j < \theta^*$) :
with probability one,

$$\lim_{k\to\infty} \sqrt{k}\, e^{-k\varphi(j)}\, \#\left\{i \in \mathbb{N}^k : s(k) < p_i < 2s(k)\right\} = Q(j),$$

where

$$s(k) := \exp\left(k\, \frac{L'(j)}{L(j)}\right)$$

and
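
$$Q(j) := \frac{1}{\sqrt{2\pi}}\, \frac{1 - 2^{-j}}{j} \left(\frac{L''(j)}{L(j)} - \left(\frac{L'(j)}{L(j)}\right)^2\right)^{-1/2} W(j).$$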

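Notice that, by Lemma 1, as $j < \theta^*$, $Q(j) > 0$ a.s. Define

$$\nu(k) := \left\lceil Q(j)\, k^{-1/2} e^{k\varphi(j)} / 2 \right\rceil.$$

Then (recall that $\varphi(j) > 0$), for almost all $\omega$, there exists $k_0(\omega)$ such that

$$\#\left\{i \in \mathbb{N}^k : p_i \ge s(k)\right\} \ge \nu(k) \ge 1, \quad\text{for all } k \ge k_0(\omega). \qquad (10)$$

We can now prove the following result.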
We can now prove the following result. 
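Lemma 8 Define for all $k \in \mathbb{Z}_+$

$$x_k := k^{u} \exp\left(k\, \frac{-\ln L(j)}{j}\right), \quad\text{where } u > \frac{1}{2j}.$$

Then for almost all $\omega$, there exists $k_1(\omega)$ such that

$$N_{x_k, j}(k) \ge 1, \quad\text{for all } k \ge k_1(\omega).$$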
Proof : As $\mathbb{N}^k$ is countable, we can order the boxes of the k-th generation by decreasing
size. Denote by $\mathcal{G}_k$ the family of boxes having a size at least $s(k)$. We create $\nu(k)$ new
boxes : if $\#\mathcal{G}_k \ge \nu(k)$, consider the first $\nu(k)$ boxes belonging to the family $\mathcal{G}_k$, denoted by
$b_1(k), \ldots, b_{\nu(k)}(k)$. For all $l \le \nu(k)$, we place an imaginary box $\tilde b_l(k)$ inside the box $b_l(k)$
such that a ball fallen in $b_l(k)$ is thrown in $\tilde b_l(k)$ with probability $s(k)/s_l(k)$, where $s_l(k)$ is
the size of $b_l(k)$. In particular, every imaginary box has size $s(k)$. If $\#\mathcal{G}_k < \nu(k)$, we denote
by $\tilde b_1(k), \ldots, \tilde b_{\nu(k)}(k)$ the first $\nu(k)$ boxes. Introduce the events $A_k := \{\#\mathcal{G}_k \ge \nu(k)\}$ and

$$B_k := \left\{\forall l \le \nu(k),\ \tilde b_l(k) \text{ contains less than } j \text{ balls when the first } \mathcal{P}_{x_k} \text{ have been thrown}\right\}.$$

Conditionally on $\mathcal{F}_\infty$ and $A_k$, the boxes $\tilde b_l(k)$ have size $s(k)$, so the numbers of balls contained
in $\tilde b_l(k)$, $l \le \nu(k)$, when $\mathcal{P}_{x_k}$ balls have been thrown are independent Poisson variables with
parameters $x_k s(k)$. Therefore

$$\mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \mathbb{P}\left(\mathcal{P}_{x_k s(k)} < j\right)^{\nu(k)}.$$

Thus

$$\ln \mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \nu(k) \ln \mathbb{P}\left(\mathcal{P}_{x_k s(k)} < j\right).$$

It can be easily seen that $x_k s(k)$ tends to 0, and that, as a result,

$$\ln \mathbb{P}\left(\mathcal{P}_{x_k s(k)} < j\right) \sim -x_k^j s(k)^j / j!.$$

Thus there exists a constant $c > 0$ such that for all $k \in \mathbb{Z}_+$,

$$\ln \mathbb{P}\left(\mathcal{P}_{x_k s(k)} < j\right) \le -2c\, x_k^j s(k)^j.$$

Finally, we get

$$\mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \exp\left(-c\, Q(j)\, k^{-1/2} e^{k\varphi(j)} x_k^j s(k)^j\right) = \exp\left(-c\, Q(j)\, k^{uj - 1/2}\right).$$

As $u > \frac{1}{2j}$ and $Q(j) > 0$ a.s., we get $\mathbb{E}\left[\sum_k 1_{A_k \cap B_k} \mid \mathcal{F}_\infty\right] < \infty$ a.s., so $\sum_k 1_{A_k \cap B_k} < \infty$ a.s.
Combined with (10), this proves that a.s., for all integers k sufficiently large, there exists an
imaginary box containing at least j balls when $\mathcal{P}_{x_k}$ have been thrown. As every imaginary
box is contained in a real box, Lemma 8 is proved. □
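
The small-parameter equivalent $\ln \mathbb{P}(\mathcal{P}_\lambda < j) \sim -\lambda^j/j!$ used in this proof can be illustrated numerically. The sketch below is only an illustration under our own naming (the helper functions are hypothetical and do not come from the paper); it sums the upper tail directly so that no cancellation occurs when $\lambda$ is small.

```python
import math

def poisson_upper_tail(lam, j, terms=80):
    """P(Poisson(lam) >= j), summed term by term (accurate for small lam)."""
    term = math.exp(-lam) * lam ** j / math.factorial(j)
    total = 0.0
    for i in range(j, j + terms):
        total += term
        term *= lam / (i + 1)  # next Poisson weight: lam**(i+1) / (i+1)!
    return total

def log_lower_tail(lam, j):
    """ln P(Poisson(lam) < j), computed as log1p(-P(Poisson(lam) >= j))."""
    return math.log1p(-poisson_upper_tail(lam, j))

def asymptotic_ratio(lam, j):
    """Ratio of ln P(Poisson(lam) < j) to its claimed equivalent -lam**j / j!."""
    return log_lower_tail(lam, j) / (-(lam ** j) / math.factorial(j))

for lam in (1e-1, 1e-2, 1e-4):
    print(lam, asymptotic_ratio(lam, j=3))
```

The printed ratios approach 1 as $\lambda$ decreases, which is what allows the bound $\ln \mathbb{P}(\mathcal{P}_{x_k s(k)} < j) \le -2c\, x_k^j s(k)^j$ once $x_k s(k)$ is small.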

We now deduce from Lemma 8 the lower bound of $H_{n,j}$.

Proof of Proposition 5 : Let $u$ and $u'$ be two real numbers such that $\frac{1}{2j} < u < u'$. Define
$x_k := k^{u} \exp(-k \ln L(j)/j)$ and $\varphi_k := k^{u'} \exp(-k \ln L(j)/j)$. One can show that for almost
all $\omega$, there exists $k_2(\omega)$ such that $\mathcal{P}_{x_k} \le \lfloor \varphi_{k-1} \rfloor$ for all $k \ge k_2(\omega)$. Applying Lemma 8 we
deduce that if we define the event $\Omega_0$ by

$$\Omega_0 := \left\{\omega : \text{there exists } k_3(\omega) \text{ such that } N_{\lfloor \varphi_{k-1} \rfloor, j}(k) \ge 1 \text{ for all } k \ge k_3(\omega)\right\},$$

then $\mathbb{P}(\Omega_0) = 1$. The sequence $(\varphi_k)$ is increasing and tends to infinity. Let $\omega \in \Omega_0$. Let n
be an integer large enough so that the unique integer k satisfying $\varphi_{k-1} \le n < \varphi_k$ is greater
than $k_3(\omega)$. Because $N_{\lfloor \varphi_{k-1} \rfloor, j}(k) \ge 1$ and $n \ge \lfloor \varphi_{k-1} \rfloor$, $N_{n,j}(k) \ge 1$, so $H_{n,j} > k$. Moreover,
as $n < \varphi_k$,

$$k > \frac{j}{-\ln L(j)} \ln n - u' \frac{j}{-\ln L(j)} \ln k.$$

As $n \ge \varphi_{k-1} \ge \exp(-(k-1) \ln L(j)/j)$, we have : $k \le \frac{j}{-\ln L(j)} \ln n + 1$. Therefore

$$H_{n,j} > k > \frac{j}{-\ln L(j)} \ln n - u' \frac{j}{-\ln L(j)} \ln\left(\frac{j}{-\ln L(j)} \ln n + 1\right).$$

We conclude that

$$\liminf_{n\to\infty} \frac{H_{n,j} - \frac{j}{-\ln L(j)} \ln n}{\ln\ln(n)} \ge -u' \frac{j}{-\ln L(j)} \quad\text{a.s.}$$

The fact that $u'$ can be chosen arbitrarily close to $\frac{1}{2j}$ completes the proof. □

Remark 4 Theorem 1 states that $H_{n,j} \sim C_j \ln n$ a.s. This asymptotic behaviour was proved
in the case $\theta^* = \infty$ (we even showed that $H_{n,j} = C_j \ln n + O(\ln\ln n)$ a.s.). If $\theta^* < \infty$, then
$-\ln \bar p(k)/k$ tends to $1/C^*$ a.s. (see Lemma 3). We deduce from an argument similar to that
in Proposition H] that H n j ~ Cjlnn a.s. Finally, Theorem [1] and Proposition [1] have been 
proved. 



5 Study of the saturation levels 

In this section, we prove Theorem 2 and Proposition 2. We shall first assume that (2) holds.
We shall then study the case $\theta_* = -\infty$ to complete the proof of Theorem 2.



5.1 The case $-\infty < \theta_* < 0$ and $\varphi(\theta_*) = 0$

Throughout this section, we suppose that (2) holds. In particular, $C_* = -\theta_*/\ln L(\theta_*) =
-L(\theta_*)/L'(\theta_*)$. We are first interested in the lower bound. We shall then focus on the upper
bound. We are inspired by the techniques developed in the previous section.



5.1.1 Lower bound 

We show the inequalities (6) and (7) when (2) holds. Equation (9) will be the key tool.

Proposition 6 Suppose that (2) holds. Let $j \ge 1$ be an integer and $a \in [0, 1)$. Then

$$\liminf_{n\to\infty} \frac{G_{n,jn^a} - (1-a)C_* \ln n}{\ln\ln(n)} \ge -\frac{1}{\ln L(\theta_*)} \quad\text{a.s.}$$

Recall that $G_{n,jn^a} > k$ if and only if at generation k, every box contains at least $jn^a$ balls
when n balls have been thrown. Because we want to show that $G_{n,jn^a}$ is bounded from below
by $((1-a)C_* - \varepsilon) \ln(n)$, we take $k \approx ((1-a)C_* - \varepsilon) \ln(n)$, i.e. $n \approx \exp\left(k\left(\frac{1}{(1-a)C_*} + \varepsilon\right)\right)$.
Proof : Let $u$ and $u'$ be two real numbers such that

$$u > -\frac{1}{(1-a)\theta_*} \qquad\text{and}\qquad u < u' < \frac{1}{a}\left(u + \frac{1}{\theta_*}\right),$$

where the second condition reduces to $u' > u$ for $a = 0$. Define for all $k \in \mathbb{N}$ :

$$x_k := k^{u} \exp\left(\frac{k}{(1-a)C_*}\right) \qquad\text{and}\qquad \varphi_k := k^{u'} \exp\left(\frac{k}{(1-a)C_*}\right).$$

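Proceeding as usual, let us prove the following key result : for almost all $\omega$, there exists
$k_0(\omega)$ such that

$$M_{x_k, j\varphi_k^a}(k) = 0, \quad\text{for all } k \ge k_0(\omega). \qquad (11)$$

Let $x$ and $y$ be two real numbers such that $y \ge j$ and $k \in \mathbb{N}$. We calculate $\mathbb{E}[M_{x,y}(k) \mid \mathcal{F}_k]$.
Writing $C(i; x)$ for the number of balls in the box $i$, which is integer-valued, we have

$$M_{x,y}(k) = \sum_{i : |i| = k} 1_{\{C(i;x) < y\}} = \sum_{i : |i| = k} 1_{\{C(i;x) < \lceil y \rceil\}}.$$

Now, conditionally on $\mathcal{F}_k$, $(C(i; x))_{|i| = k}$ are Poisson variables with parameters $x p_i$, so

$$\mathbb{E}[M_{x,y}(k) \mid \mathcal{F}_k] = \sum_{i : |i| = k} \mathbb{P}\left(\mathcal{P}_{x p_i} < \lceil y \rceil\right).$$

Applying (9), we get

$$\mathbb{E}[M_{x,y}(k) \mid \mathcal{F}_k] \le d(-\theta_*)\, \lceil y \rceil^{-\theta_*} \sum_{i : |i| = k} (x p_i)^{\theta_*}.$$

Now, as $y \ge j$,

$$1 \le \frac{\lceil y \rceil}{y} \le 1 + \frac{1}{y} \le 1 + \frac{1}{j}.$$

In the notations of Lemma 1, we have :

$$\mathbb{E}[M_{x,y}(k) \mid \mathcal{F}_k] \le d'\, y^{-\theta_*} x^{\theta_*} L(\theta_*)^k\, W^{(k)}(\theta_*),$$

where $d' := d(-\theta_*)(1 + 1/j)^{-\theta_*}$. We finally get

$$\mathbb{E}[M_{x,y}(k)] \le d'\, y^{-\theta_*} x^{\theta_*} L(\theta_*)^k.$$

For $x = x_k$ and $y = j\varphi_k^a \ge j$, we obtain for every integer $k \in \mathbb{N}$ :

$$\mathbb{E}\left[M_{x_k, j\varphi_k^a}(k)\right] \le d'\, j^{-\theta_*}\, k^{\theta_*(u - u'a)}.$$

Now $\theta_*(u - u'a) < -1$. As a consequence

$$\mathbb{E}\left[\sum_k 1_{\{M_{x_k, j\varphi_k^a}(k) \ge 1\}}\right] \le \mathbb{E}\left[\sum_k M_{x_k, j\varphi_k^a}(k)\right] < \infty.$$

In particular, there is an a.s. finite number of integers k such that $M_{x_k, j\varphi_k^a}(k) \ge 1$, which
proves (11).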
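
The cancellation of the exponential factors in the first-moment bound can be verified numerically. The following sketch is an illustration only: the parameter values are arbitrary assumptions, the constant $d'$ is omitted, and the only relation used is $C_* = -\theta_*/\ln L(\theta_*)$ recalled at the beginning of this section.

```python
import math

def log_first_moment_bound(k, j, a, u, u_prime, theta, log_L):
    """ln of y**(-theta) * x**theta * L(theta)**k for x = x_k and y = j * phi_k**a."""
    c_low = -theta / log_L                       # C_* = -theta_* / ln L(theta_*)
    growth = k / ((1.0 - a) * c_low)             # common exponent of x_k and phi_k
    log_x = u * math.log(k) + growth             # ln x_k
    log_y = math.log(j) + a * (u_prime * math.log(k) + growth)  # ln(j * phi_k**a)
    return -theta * log_y + theta * log_x + k * log_L

def log_closed_form(k, j, a, u, u_prime, theta):
    """ln of j**(-theta) * k**(theta * (u - u' * a)), the simplified bound."""
    return -theta * math.log(j) + theta * (u - a * u_prime) * math.log(k)

# Hypothetical values satisfying u > -1/((1-a) theta) and u < u' < (u + 1/theta)/a.
params = dict(j=3, a=0.5, u=1.5, u_prime=1.8, theta=-2.0)
for k in (1, 10, 1000):
    lhs = log_first_moment_bound(k, log_L=0.8, **params)
    rhs = log_closed_form(k, **params)
    print(k, lhs, rhs)
```

With these values $\theta_*(u - u'a) = -1.2 < -1$, so the bounds $k^{\theta_*(u - u'a)}$ are summable in $k$, as required for the Borel-Cantelli step.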

Performing the same calculations as at the end of the proof of Proposition 4, one finally
gets :

$$\liminf_{n\to\infty} \frac{G_{n,jn^a} - (1-a)C_* \ln n}{\ln\ln n} \ge -u'(1-a)C_* \quad\text{a.s.}$$

The fact that $u'$ can be chosen arbitrarily close to $-\frac{1}{(1-a)\theta_*}$ completes the proof. □



5.1.2 Upper bound 

In this section, we are interested in the upper bound of the saturation levels $G_{n,j}$. We prove
that the inequalities (6) and (7) are in fact equalities whenever (1) and (2) hold by studying
the smallest box, regardless of the value of j; there is no phase transition.

Proposition 7 Suppose that (1) and (2) hold. Let $j \ge 1$ be an integer and $a \in (0, 1)$. Then

$$\limsup_{n\to\infty} \frac{G_{n,j} - C_* \ln n}{\ln\ln n} \le \frac{3}{2\ln L(\theta_*)} + \frac{C_*}{j} \quad\text{a.s.}$$

and

$$\limsup_{n\to\infty} \frac{G_{n,jn^a} - (1-a)C_* \ln n}{\ln\ln n} \le \frac{3}{2\ln L(\theta_*)} \quad\text{a.s.}$$

In order to prove both inequalities simultaneously, we may suppose that $a \in [0, 1)$, the case
$G_{n,j}$ corresponding to $G_{n,jn^a}$ with $a = 0$.

By definition, $G_{n,jn^a} \le k$ if and only if at generation k, there exists a box containing
less than $jn^a$ balls when n balls have been thrown. We shall see that it suffices to
consider the smallest box. As we intend to show that $G_{n,jn^a}$ is bounded from above by
$((1-a)C_* + \varepsilon) \ln n$, we take $k \approx ((1-a)C_* + \varepsilon) \ln n$, i.e. $n \approx \exp\left(k\left(\frac{1}{(1-a)C_*} - \varepsilon\right)\right)$.

Proof : Let $\gamma$, $\gamma'$ and $\gamma''$ be three real numbers such that :

• $\gamma > -\frac{3}{2\theta_*}$, $\gamma' > \gamma + \frac{1}{j}$ and $\gamma'' > \gamma'$ if $a = 0$,

• $\gamma > -\frac{3}{2\theta_*}$, $\gamma' > \frac{\gamma}{1-a}$ and $\gamma' < \gamma'' < \frac{\gamma' - \gamma}{a}$ if $a > 0$.

Define

$$x_k := k^{-\gamma'} \exp\left(\frac{k}{(1-a)C_*}\right) \qquad\text{and}\qquad \varphi_k := k^{-\gamma''} \exp\left(\frac{k}{(1-a)C_*}\right).$$

Let us prove the following result : for almost all $\omega$, there exists $k_0(\omega)$ such that

$$M_{x_k, j\varphi_k^a}(k) \ge 1, \quad\text{for all } k \ge k_0(\omega). \qquad (12)$$

Define $\tilde p(k) := k^{\gamma} e^{-k/C_*}$. We consider an imaginary box $\tilde b(k)$ at generation k such that

1. if $1 \le \tilde p(k)$ : every ball thrown is placed in $\tilde b(k)$.

2. if $\underline{p}(k) \le \tilde p(k) < 1$, where $\underline{p}(k)$ is the size of the smallest box $\underline{b}(k)$ : every ball fallen
in $\underline{b}(k)$ is also placed in the imaginary box and every other ball is placed in $\tilde b(k)$ with
probability $(\tilde p(k) - \underline{p}(k))/(1 - \underline{p}(k))$. Hence, the imaginary box has size $\tilde p(k)$.

3. if $\tilde p(k) < \underline{p}(k)$ : every ball fallen in the smallest box is placed in the imaginary box
with probability $\tilde p(k)/\underline{p}(k)$, and no other ball is placed in $\tilde b(k)$.

The box $\tilde b(k)$ has thus size $\tilde p(k) \wedge 1$, and whenever $\tilde p(k) \ge \underline{p}(k)$, it contains the smallest box.
From Lemma 4 we know that a.s., for all integers k sufficiently large, $\tilde p(k) \ge \underline{p}(k)$. To prove
(12), all we have to do is therefore to prove that $\mathbb{P}(\limsup A_k) = 0$, where the event
$A_k$ is defined by

$$A_k := \left\{\tilde b(k) \text{ contains at least } j\varphi_k^a \text{ balls when the first } \mathcal{P}_{x_k} \text{ balls have been thrown}\right\}.$$

By the Borel-Cantelli lemma, it suffices to show that $\sum_k \mathbb{P}(A_k) < \infty$. Let k be an integer
sufficiently large so that $\tilde p(k) < 1$. The imaginary box has then size $\tilde p(k)$, so we have :

$$\mathbb{P}(A_k) = \mathbb{P}\left(\mathcal{P}_{x_k \tilde p(k)} \ge j\varphi_k^a\right).$$

We would like to apply (8). To do so, let $\rho := j$ if $a = 0$ and $\rho := 2/(\gamma' - \gamma - a\gamma'') > 0$ if
$a > 0$. Note that for all integers k sufficiently large, $j\varphi_k^a \ge \rho$. Equation (8) then ensures
that :

$$\mathbb{P}(A_k) \le c(\rho)\, j^{-\rho}\, k^{\rho(a\gamma'' + \gamma - \gamma')}.$$

As $\rho(a\gamma'' + \gamma - \gamma') < -1$, $\sum_k \mathbb{P}(A_k) < \infty$, which proves (12).

Performing the same calculations as at the end of the proof of Proposition 3, one finally
gets :

$$\limsup_{n\to\infty} \frac{G_{n,jn^a} - (1-a)C_* \ln n}{\ln\ln n} \le \gamma''(1-a)C_* \quad\text{a.s.}$$

The fact that $\gamma''$ can be chosen arbitrarily close either to $-\frac{3}{2\theta_*} + \frac{1}{j}$ if $a = 0$ or to $-\frac{3}{2(1-a)\theta_*}$
if $a > 0$ completes the proof. □

Proposition 2 and a part of Theorem 2 have been proved. We now turn our attention to the
other part of Theorem 2.

5.2 The case $\theta_* = -\infty$

In this section, we prove that if $\theta_* = -\infty$, then $G_{n,j} \sim C_* \ln n$ a.s. We begin by showing the
following lemma :

Lemma 9 If $\theta_* = -\infty$, then $-\theta/\ln L(\theta)$ tends to $C_*$ as $\theta$ tends to $-\infty$.



Proof : The condition $\theta_* = -\infty$ means that $\varphi(\theta) > 0$ for all $\theta < 0$. Let $\psi = \ln L$. Recall
that $C_* = \lim_{\theta\to-\infty} -1/\psi'(\theta)$. As $\psi$ is convex decreasing, it is known that $-\psi(\theta)/\theta$ tends to
$l \in (0, \infty]$ as $\theta$ tends to $-\infty$. We distinguish two cases.

Either $l$ is finite, then one can easily show that $\theta \mapsto \psi(\theta) + l\theta$ is increasing. As a result,
$\psi' \ge -l$. If $l \ne 1/C_*$, then $\psi'(\theta) \not\to -l$ as $\theta$ tends to $-\infty$, so there exists $\varepsilon > 0$ such that
$\psi' \ge \varepsilon - l$. Let $\theta < 0$. Then $\int_\theta^0 \psi' \ge (\varepsilon - l)(-\theta)$, i.e. $\psi(\theta) + l\theta \le \psi(0) + \varepsilon\theta$. Dividing by $-\theta$
and taking the limit, we get $\varepsilon \le 0$, which is absurd. We deduce that $l = 1/C_*$, which means
that $-\theta/\ln L(\theta) \to C_*$ as $\theta$ tends to $-\infty$.

Or $l$ is infinite. As for all $\theta < 0$, $\varphi(\theta) > 0$, the function $\theta \in (-\infty, 0) \mapsto \theta/\psi(\theta) - 1/\psi'(\theta)$
is increasing. It is also positive. If it does not tend to 0 at $-\infty$, then it is bounded from
below by some $\varepsilon > 0$. Multiplying by $-\psi'(\theta) > 0$, we get : $1 \ge 1 - \theta\psi'(\theta)/\psi(\theta) \ge -\varepsilon\psi'(\theta)$.
Consequently $\psi'(\theta) \ge -1/\varepsilon$ and $\psi(0) - \psi(\theta) \ge \theta/\varepsilon$. Dividing by $\theta < 0$, we obtain :
$-\psi(\theta)/\theta \le 1/\varepsilon - \psi(0)/\theta$. Taking the limit, we have $\infty \le 1/\varepsilon$, which is absurd. Finally,
$\theta/\psi(\theta) - 1/\psi'(\theta)$ tends to 0 as $\theta$ tends to $-\infty$. Now, by definition of $C_*$, $\psi'(\theta)$ tends to
$-1/C_*$. As a result, $-\theta/\psi(\theta)$ tends to $C_*$ as $\theta$ tends to $-\infty$. □
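
Lemma 9 can be illustrated on a toy example. The sketch below assumes a deterministic binary splitting in which every box of size $p_i$ has two children of sizes $p\,p_i$ and $(1-p)\,p_i$, so that $L(\theta) = p^\theta + (1-p)^\theta$; this choice is ours, is not taken from the paper, and is used only to illustrate the analytic statement. For this $L$, one can check that $\varphi(\theta) > 0$ for all $\theta < 0$ and that $-1/\psi'(\theta)$ tends to $-1/\ln \min(p, 1-p)$.

```python
import math

def log_L(theta, p):
    """ln L(theta) for the toy L(theta) = p**theta + (1 - p)**theta, stable for theta < 0."""
    p_min, p_max = min(p, 1.0 - p), max(p, 1.0 - p)
    # Factor out the dominant term p_min**theta to avoid overflow for very negative theta.
    return theta * math.log(p_min) + math.log1p((p_max / p_min) ** theta)

def ratio(theta, p):
    """The quantity -theta / ln L(theta) studied in Lemma 9."""
    return -theta / log_L(theta, p)

def c_low(p):
    """Limit of -1 / psi'(theta) as theta -> -infinity in this toy model."""
    return -1.0 / math.log(min(p, 1.0 - p))

for theta in (-1.0, -10.0, -100.0):
    print(theta, ratio(theta, 0.3), c_low(0.3))
```

The printed ratios approach the limiting constant as $\theta$ decreases, in agreement with the statement of the lemma.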

5.2.1 Lower bound 

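We show that $G_{n,j} \ge C_* \ln n + o(\ln n)$ a.s. Equation (9) will be very useful.

Proposition 8 Suppose that $\theta_* = -\infty$. Let $j \ge 1$ be an integer. Then

$$\liminf_{n\to\infty} \frac{G_{n,j}}{\ln n} \ge C_* \quad\text{a.s.}$$

Proof : We follow the usual strategy. Let $\theta < 0$ and $u > -\ln L(\theta)/\theta$. Define for all $k \in \mathbb{Z}_+$,
$x_k := e^{ku}$. Applying (9), we can show that

$$\mathbb{E}[M_{x_k,j}(k) \mid \mathcal{F}_k] \le d(-\theta)\, j^{-\theta} x_k^{\theta} L(\theta)^k\, W^{(k)}(\theta),$$

so

$$\mathbb{E}[M_{x_k,j}(k)] \le d(-\theta)\, j^{-\theta} \exp\{k\theta(u + \ln L(\theta)/\theta)\}.$$

As $\theta(u + \ln L(\theta)/\theta) < 0$, $\mathbb{E}\left[\sum_k M_{x_k,j}(k)\right] < \infty$, so a.s., for all integers k sufficiently large,
$M_{x_k,j}(k) = 0$. The usual calculations yield :

$$\liminf_{n\to\infty} \frac{G_{n,j}}{\ln n} \ge \frac{-\theta}{\ln L(\theta)} \quad\text{a.s.}$$

Lemma 9 enables us to complete the proof. □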

5.2.2 Upper bound 

We finally prove that $G_{n,j} \le C_* \ln n + o(\ln n)$ a.s. As $\theta_* = -\infty$, Lemma [3] cannot be applied
(we do not know how the size of the smallest box behaves). Surprisingly, we are inspired by
the proof of Proposition 

Proposition 9 Suppose that $\theta_* = -\infty$. Let $j \ge 1$ be an integer. Then

$$\limsup_{n\to\infty} \frac{G_{n,j}}{\ln n} \le C_* \quad\text{a.s.}$$



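Proof : We may suppose that $j = 1$. Let $\theta < 0$. Here, the boxes of the k-th generation that
will play a key role are those having a size approximately $s(k)$, where

$$s(k) := \exp\left(k\, \frac{L'(\theta)}{L(\theta)}\right).$$

Applying Lemma 2, we have with probability one :

$$\lim_{k\to\infty} \sqrt{k}\, e^{-k\varphi(\theta)}\, \#\left\{i \in \mathbb{N}^k : s(k)/2 < p_i < s(k)\right\} = Q(\theta),$$

where $Q(\theta)$ denotes the limiting random variable given by Lemma 2.

Notice that, by Lemma 1, $Q(\theta) > 0$ a.s. Define

$$\nu(k) := \left\lceil Q(\theta)\, k^{-1/2} e^{k\varphi(\theta)} / 2 \right\rceil.$$

Then (recall that $\varphi(\theta) > 0$), for almost all $\omega$, there exists $k_0(\omega)$ such that

$$\#\left\{i \in \mathbb{N}^k : p_i \le s(k)\right\} \ge \nu(k) \ge 1, \quad\text{for all } k \ge k_0(\omega). \qquad (13)$$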

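Let $u < -L'(\theta)/L(\theta)$. Define for all $k \in \mathbb{Z}_+$, $x_k := e^{ku}$. Let us prove that for almost all
$\omega$, there exists $k_1(\omega)$ such that

$$M_{x_k, 1}(k) \ge 1, \quad\text{for all } k \ge k_1(\omega). \qquad (14)$$

Denote by $\mathcal{G}_k$ the family of boxes having a size at most $s(k)$. We define $\nu(k)$ new boxes :
if $\#\mathcal{G}_k \ge \nu(k)$, we create $\nu(k)$ imaginary boxes $\tilde b_1(k), \ldots, \tilde b_{\nu(k)}(k)$ of size $s(k)$ containing
the first $\nu(k)$ real boxes belonging to the family $\mathcal{G}_k$. If $\#\mathcal{G}_k < \nu(k)$, we denote by
$\tilde b_1(k), \ldots, \tilde b_{\nu(k)}(k)$ the first $\nu(k)$ boxes of the k-th generation. Introduce the event $A_k :=
\{\#\mathcal{G}_k \ge \nu(k)\}$ and

$$B_k := \left\{\forall l \le \nu(k),\ \tilde b_l(k) \text{ contains at least one ball when the first } \mathcal{P}_{x_k} \text{ have been thrown}\right\}.$$

Conditionally on $\mathcal{F}_\infty$ and $A_k$, the boxes $\tilde b_l(k)$ have size $s(k)$, so the numbers of balls contained
in $\tilde b_l(k)$, $l \le \nu(k)$, when $\mathcal{P}_{x_k}$ balls have been thrown are independent Poisson variables with
parameters $x_k s(k)$. Therefore

$$\mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \mathbb{P}\left(\mathcal{P}_{x_k s(k)} \ge 1\right)^{\nu(k)}.$$

Thus

$$\ln \mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \nu(k) \ln \mathbb{P}\left(\mathcal{P}_{x_k s(k)} \ge 1\right).$$

Now $\mathbb{P}\left(\mathcal{P}_{x_k s(k)} \ge 1\right) \le x_k s(k)$, so

$$\mathbb{P}(A_k \cap B_k \mid \mathcal{F}_\infty) \le \exp\left\{\left(u + \frac{L'(\theta)}{L(\theta)}\right) \frac{Q(\theta)}{2}\, k^{1/2} e^{k\varphi(\theta)}\right\}.$$

As $u + L'(\theta)/L(\theta) < 0$, $\varphi(\theta) > 0$ and $Q(\theta) > 0$ a.s., we get $\mathbb{E}\left[\sum_k 1_{A_k \cap B_k} \mid \mathcal{F}_\infty\right] < \infty$ a.s., so
$\sum_k 1_{A_k \cap B_k} < \infty$ a.s. Combined with (13), this proves that a.s., for all integers k sufficiently
large, there exists an empty imaginary box when $\mathcal{P}_{x_k}$ have been thrown. As every imaginary
box contains a real box, (14) is proved.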

Proceeding as usual, we deduce that for all $\theta < 0$,

$$\limsup_{n\to\infty} \frac{G_{n,1}}{\ln n} \le \frac{L(\theta)}{-L'(\theta)} \quad\text{a.s.}$$

We get the result by letting $\theta$ tend to $-\infty$. □
Finally, Theorem 2 has been proved.
6 An explanation for the phase transition

In this section, we explain heuristically why a phase transition may occur in the asymptotics
of $H_{n,j}$ but not in those of $H_{n,n^a}$, $G_{n,j}$ and $G_{n,n^a}$. To do so, we rephrase the setting in terms
of interval-fragmentations. One can easily construct a family $(F(k), k \in \mathbb{Z}_+)$ of random
open subsets of $(0, 1)$ with $F(0) = (0, 1)$, which is nested in the sense that $F(k') \subseteq F(k)$
for all $k \le k'$, and such that the set of the lengths of the interval components of $F(k)$ is
$\{p_i : |i| = k\}$ for every integer k. The boxes correspond to the interval components and
the balls to a sequence $(U_i)_{i \in \mathbb{N}}$ of independent random variables uniformly distributed on
$(0, 1)$ and independent of the fragmentation F. The height $H_{n,j}$ corresponds to the least
integer k such that every interval component of $F(k)$ contains less than j elements of the set
$\{U_1, \ldots, U_n\}$, and the saturation level $G_{n,j}$ to the least integer k such that there exists an
interval component of $F(k)$ containing less than j elements of the set $\{U_1, \ldots, U_n\}$. Roughly
speaking, the height $H_{n,j}$ depends crucially on the minimal length $\underline{m}_{n,j}$ of the intervals
$[U_{(i)}, U_{(i+j-1)}]$ for $1 \le i \le n - j + 1$, where $0 < U_{(1)} < \cdots < U_{(n)} < 1$ are the order statistics of
the family $(U_1, \ldots, U_n)$, whereas the saturation level $G_{n,j}$ is related to the maximal length
$\overline{m}_{n,j}$ of the intervals $[U_{(i)}, U_{(i+j)}]$ for $0 \le i \le n - j + 1$, where $U_{(0)} := 0$ and $U_{(n+1)} := 1$. Indeed,
the height $H_{n,j}$ is the first time k when no cluster $[U_{(i)}, U_{(i+j-1)}]$, $1 \le i \le n - j + 1$, of size j (in
particular the smallest one) is included in an interval component of $F(k)$, and the saturation
level $G_{n,j}$ is the first time k when there exists a cluster $[U_{(i)}, U_{(i+j)}]$, $0 \le i \le n - j + 1$, of size
$j + 1$ (possibly the largest one) containing an interval component of $F(k)$.
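
The two definitions above can be made concrete on a toy fragmentation. The following sketch assumes a deterministic dyadic fragmentation, where $F(k)$ consists of the $2^k$ intervals of length $2^{-k}$; this is our own simplifying assumption (the paper's fragmentation is random), and the function names are hypothetical.

```python
from collections import Counter

MAX_GENERATION = 60  # guard: floats no longer resolve dyadic cells far beyond this depth

def cell_counts(points, k):
    """Number of points in each occupied dyadic interval [m / 2**k, (m + 1) / 2**k)."""
    return Counter(int(u * 2 ** k) for u in points)

def height(points, j):
    """Least k such that every interval of generation k holds fewer than j points."""
    for k in range(MAX_GENERATION + 1):
        if max(cell_counts(points, k).values()) < j:
            return k
    raise ValueError("points are not separated within MAX_GENERATION generations")

def saturation_level(points, j):
    """Least k such that some interval of generation k holds fewer than j points."""
    for k in range(MAX_GENERATION + 1):
        counts = cell_counts(points, k)
        # An interval missing from `counts` is empty, hence holds fewer than j points.
        if len(counts) < 2 ** k or min(counts.values()) < j:
            return k
    raise ValueError("no unsaturated interval found within MAX_GENERATION generations")

points = [0.1, 0.2, 0.6, 0.7]
print(height(points, 2), saturation_level(points, 2))
```

For these four points and $j = 2$, the pairs $\{0.1, 0.2\}$ and $\{0.6, 0.7\}$ are only separated at generation 3, whereas an empty interval already appears at generation 2, so the saturation level is reached before the height.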

It is easy to show from Lemma 3 that if we used equidistributed points $\{j/(n+1) : 1 \le
j \le n\}$ instead of i.i.d. uniform points, then no phase transition would occur. More precisely,
all the heights $H_{n,j}$ would be equivalent to $C^* \ln n$ a.s. Further, the heights $H_{n,n^a}$ would be
equivalent to $(1-a)C^* \ln n$ a.s. and the saturation levels $G_{n,jn^a}$ to $(1-a)C_* \ln n$ a.s.

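We first explain why the clusters of size $n^a$ behave as if the points $U_i$, $1 \le i \le n$, were
equidistributed on $(0, 1)$. It will follow that no phase transition occurs in the asymptotics
of $H_{n,n^a}$ and $G_{n,n^a}$. To do so, let us first prove that $\underline{m}_{n,n^a} = n^{-1+a+o(1)}$ a.s. Let $\varepsilon > 0$. We
have :

$$\mathbb{P}\left(\underline{m}_{n,n^a} < n^{-1+a-2\varepsilon}\right) \le n\, \mathbb{P}\left(U_{(n^a)} - U_{(1)} < n^{-1+a-2\varepsilon}\right).$$

Let $(e_i)_{i \in \mathbb{N}}$ be a sequence of independent exponential variables with parameter 1. Define
$\gamma_k := \sum_{1 \le i \le k} e_i$ for all $k \ge 1$. Then we have :

$$\left(U_{(1)}, \ldots, U_{(n)}\right) \overset{(d)}{=} \left(\gamma_1/\gamma_{n+1}, \ldots, \gamma_n/\gamma_{n+1}\right), \qquad (15)$$

so

$$\mathbb{P}\left(U_{(n^a)} - U_{(1)} < n^{-1+a-2\varepsilon}\right) = \mathbb{P}\left(\frac{\gamma_{n^a} - \gamma_1}{\gamma_{n+1}} < n^{-1+a-2\varepsilon}\right)$$
$$\le \mathbb{P}\left(\gamma_{n^a} - \gamma_1 < n^{1+\varepsilon} n^{-1+a-2\varepsilon}\right) + \mathbb{P}\left(\gamma_{n+1} > n^{1+\varepsilon}\right)$$
$$= \mathbb{P}\left(\mathcal{P}_{n^{a-\varepsilon}} \ge n^a\right) + \mathbb{P}\left(\mathcal{P}_{n^{1+\varepsilon}} < n + 1\right)$$

since the sequence $(\gamma_i)_{i \in \mathbb{N}}$ has the same law as the ordered sequence of points of a standard
Poisson process. Applying the Poisson tail estimates with $\rho = 3/\varepsilon$, we have for every integer n sufficiently
large :

$$\mathbb{P}\left(\mathcal{P}_{n^{a-\varepsilon}} \ge n^a\right) + \mathbb{P}\left(\mathcal{P}_{n^{1+\varepsilon}} < n + 1\right) \le \left(c(\rho) + 2^{\rho} d(\rho)\right) n^{-3}.$$

We deduce that for every integer n sufficiently large, we have :

$$\mathbb{P}\left(\underline{m}_{n,n^a} < n^{-1+a-2\varepsilon}\right) \le \left(c(\rho) + 2^{\rho} d(\rho)\right) n^{-2}.$$

Applying the Borel-Cantelli lemma and letting $\varepsilon$ tend to 0, we easily conclude that $\underline{m}_{n,n^a} \ge
n^{-1+a+o(1)}$ a.s. Similar calculations yield : $\overline{m}_{n,n^a} \le n^{-1+a+o(1)}$ a.s. We conclude that the
clusters of size $n^a$ essentially behave as if the points $U_i$, $1 \le i \le n$, were equidistributed on
$(0, 1)$, which explains why no phase transition occurs in the asymptotics of $H_{n,n^a}$ and $G_{n,n^a}$.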
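
The identity in law (15) also gives a direct way of sampling the order statistics without sorting. The sketch below is an illustration under our own naming (both helpers are hypothetical); it builds the normalised partial sums of exponential variables and computes the minimal length of the clusters of a given size.

```python
import random

def uniform_order_statistics(n, rng):
    """Sample (U_(1), ..., U_(n)) as (gamma_1 / gamma_{n+1}, ..., gamma_n / gamma_{n+1})."""
    partial_sums, total = [], 0.0
    for _ in range(n + 1):
        total += rng.expovariate(1.0)  # e_i: independent exponential variables, parameter 1
        partial_sums.append(total)
    return [g / partial_sums[-1] for g in partial_sums[:-1]]

def min_cluster_length(ordered, size):
    """Minimal length of the clusters [U_(i), U_(i+size-1)] of `size` consecutive points."""
    return min(ordered[i + size - 1] - ordered[i]
               for i in range(len(ordered) - size + 1))

rng = random.Random(1)  # fixed seed, so that the run is reproducible
ordered = uniform_order_statistics(500, rng)
for size in (2, 3, 10):
    print(size, min_cluster_length(ordered, size))
```

The output is already sorted by construction, and the minimal cluster length is nondecreasing in the cluster size, since every cluster of size $j + 1$ contains a cluster of size $j$.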

On the contrary, the clusters of size j behave very differently from the equidistribution
case. For instance, thanks to the identification (15), we may apply well-known results on
extreme values (see e.g. [2S]). One finds that $\underline{m}_{n,j} \sim n^{-j/(j-1)}$, so that $\underline{m}_{n,j}$ is much smaller
than $j/n$, even at a logarithmic scale. Furthermore, the larger the integer j is, the closer these
two quantities are, so that $H_{n,j}$ may eventually be equivalent to $C^* \ln n$ a.s. if j is sufficiently
large. On the other hand, if the integer j is too small, then $H_{n,j}$ may be greater than
$(1+\varepsilon)C^* \ln n + o(\ln n)$ a.s. for some $\varepsilon > 0$ : one has to wait for a long time before the smallest
clusters of size j are no longer included in any interval component of the fragmentation; in
that case, there is a phase transition.

Concerning the saturation level $G_{n,j}$, one gets $\overline{m}_{n,j} \approx \ln n/n$, which differs from the
equidistribution case only up to a logarithmic factor, and the impact of the latter is asymptotically
negligible. It explains why no phase transition occurs in the asymptotics of $G_{n,j}$.